Closed emattei closed 1 year ago
Human: completed. Mouse: needs a new TSS file and new cCREs. The index should be ready.
This affects the DORCs workflow: we have a .Rdata file that contains the hg19/hg38/mm10 TSSRanges as GRanges
objects. It needs to be modified either to include the new TSSRanges or to accept a bed file as input.
TSS bed files were created using transcripts from TxDb.mm39 and TxDb.hg38 R packages.
hg38: gs://broad-buenrostro-pipeline-genome-annotations/GRCh38/genes_annotations/hg38.TxDb_transcripts.TSS.bed
mm39: gs://broad-buenrostro-pipeline-genome-annotations/mm39/gene-annotations/mm39.TxDb_transcripts.TSS.bed
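As a rough illustration of what a TSS bed file like these holds, the sketch below derives a 1 bp, strand-aware TSS interval from transcript coordinates. It is not the script used to build the files above. The `Transcript` tuple, the function name, and the assumption that inputs are 0-based half-open (BED-style) coordinates are all hypothetical.

```python
from typing import NamedTuple


class Transcript(NamedTuple):
    """Hypothetical transcript record in BED-style coordinates."""
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    name: str
    strand: str  # "+" or "-"


def tss_bed_line(tx: Transcript) -> str:
    """Return a 6-column BED line covering the single TSS base of a transcript."""
    if tx.strand == "+":
        # Forward strand: transcription starts at the leftmost base.
        tss_start = tx.start
    elif tx.strand == "-":
        # Reverse strand: transcription starts at the rightmost base.
        tss_start = tx.end - 1
    else:
        raise ValueError(f"unexpected strand {tx.strand!r} for {tx.name}")
    fields = [tx.chrom, str(tss_start), str(tss_start + 1), tx.name, ".", tx.strand]
    return "\t".join(fields)
```

For a reverse-strand transcript the TSS is taken from the end coordinate, which is the detail most likely to differ between TSS files built with different tools.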
How does this compare with the hg38 TSS bed file we already have? (Are we not going to use it anymore?)
-- Neva Cherniavsky Durand, Ph.D. | she, her, hers Senior Scientist | Gene Regulation Observatory Broad Institute of MIT and Harvard
We haven't swapped it in yet for the genome TSS. It's ready, but it needs some more testing; at least everyone knows where it is.
Are these packages consistent with the GTFs from Synapse?
It depends on what you mean by "consistent". These are published packages that define the hg38 and mm39 transcripts, exons, and genes. We could define that information from the synapse gtf files but there are a lot of decisions to make regarding how to select the right transcripts/genes, and then test them.
These are now completed.
Human hg38 with v43 GTF gs://broad-buenrostro-pipeline-genome-annotations/IGVF_human_v43/Homo_sapiens_genome_files_hg38_v43.tsv
Mouse mm39 with v32 GTF gs://broad-buenrostro-pipeline-genome-annotations/IGVF_mouse_v32/Mus_musculus_genome_files_mm39_v32.tsv
Currently, no cCREs are present for mm39. A possible solution would be to use liftOver?
makes sense to me
mm39 ccre bed file: gs://broad-buenrostro-pipeline-genome-annotations/mm39/mm39_ccre_liftover_resized.bed
This file was created by running UCSC liftOver on the mm10 cCRE bed file to convert it from mm10 to mm39. 8 coordinates failed to lift over, and 39 coordinates had to be resized to 300 bp.
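How the 39 regions were resized is not stated in this thread. One plausible approach, sketched here purely as an assumption, is to fix each region to 300 bp around its midpoint. The function name `resize_centered` is hypothetical, and coordinates are assumed to be 0-based half-open as in BED.

```python
from typing import Tuple


def resize_centered(start: int, end: int, width: int = 300) -> Tuple[int, int]:
    """Return a fixed-width interval centred on the midpoint of [start, end).

    Assumption: the start is clamped at 0; no chromosome-length clamp is
    applied, so a real pipeline would also need chromosome sizes.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    if end <= start:
        raise ValueError("end must be greater than start")
    midpoint = (start + end) // 2
    new_start = max(0, midpoint - width // 2)
    return new_start, new_start + width
```

Resizing around the midpoint keeps each element roughly where liftOver placed it; anchoring at the start or end instead would be an equally defensible choice.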
https://www.synapse.org/#!Synapse:syn39048501
Gencode v43 for human and M32 for mouse