RobertsLab / resources

https://robertslab.github.io/resources/

BEDfiles for Cgigas ATAC/TSS data #1657

Closed yaaminiv closed 4 months ago

yaaminiv commented 1 year ago

Data here: https://gannet.fish.washington.edu/metacarcinus/Cgigas/shelly/

The first step is probably ensuring that there are BED files for each data type (ATAC-Seq, csRNA-seq, 5'GRO, mSTART). The next step is figuring out what the data tells us.

The goal is to intersect these files (e.g. with intersectBed) with the DML lists and understand how methylation is impacted by TSS accessibility, etc.
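In practice this intersection would most likely be run with bedtools, but the underlying idea can be sketched in plain Python. The snippet below is a minimal sketch, assuming DMLs are single-base positions and features (ATAC peaks, TSS calls) are 0-based, half-open BED intervals. The function name `dml_in_features`, the chromosome names, and the coordinates are all hypothetical and not taken from the actual data.

```python
from bisect import bisect_left
from collections import defaultdict


def dml_in_features(features, dmls):
    """Return (chrom, pos, start, end) for every DML inside a feature.

    features: iterable of (chrom, start, end), 0-based half-open (BED style).
    dmls:     iterable of (chrom, pos), 0-based single-base positions.
    """
    # Index DML positions per chromosome so each feature is a binary search.
    by_chrom = defaultdict(list)
    for chrom, pos in dmls:
        by_chrom[chrom].append(pos)
    for positions in by_chrom.values():
        positions.sort()

    hits = []
    for chrom, start, end in features:
        positions = by_chrom.get(chrom, [])
        lo = bisect_left(positions, start)  # first position >= start
        hi = bisect_left(positions, end)    # first position >= end (excluded)
        for pos in positions[lo:hi]:
            hits.append((chrom, pos, start, end))
    return hits


# Hypothetical example data, for illustration only.
example_features = [("chr1", 100, 200), ("chr2", 50, 60)]
example_dmls = [("chr1", 150), ("chr1", 200), ("chr2", 55), ("chr3", 10)]

if __name__ == "__main__":
    for hit in dml_in_features(example_features, example_dmls):
        print(*hit, sep="\t")
```

Because BED intervals are half-open, a DML sitting exactly at a feature's end coordinate is not counted as overlapping. Note that this only makes sense once the features and the DMLs are on the same genome assembly.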


Additional information from Sascha:

Step by step: 5'GRO, similar to csRNA-seq, gives you the TSS. For csRNA-seq, you need an input library. The tutorial is at http://homer.ucsd.edu/homer/ngs/csRNAseq/index.html

Here is a quick warning: the data passed QC, but it was processed in bulk with 50 species. A careful evaluation by an oyster expert may get even better TSS calls etc. out of the data, as Chris Benner (who did this analysis) used a script to set thresholds etc.

You will find data for two individuals, from which I took tissue separately from the branchies (gills) and the "muscle/stick", aka Coeur. In the csRNA-seq data, Oyster_8 and Oyster_9 are the Coeur. Again, my apologies for the sloppy labeling, but I did about 50 species back then and we went through different "automated" systems... aka the codes make sense to us but no one else.

As for the RNA-seq data, sorry, they are badly labeled. The JHS number is the identifier, but if you want to match the RNA-seq with the csRNA-seq (i.e. to call stable or unstable TSS):

JHS ID | Sample label
-- | --
JHS 741 | Oyster1_Branchia_6_Ribo0_Core
JHS 742 | Oyster2_Branchia_7_Ribo0_Core
JHS 743 | Oyster1_Coeur_8__Ribo0_Core
JHS 744 | Oyster2_Coeur_9__Ribo0_Core

sr320 commented 1 year ago

Started to look at this.... https://github.com/sr320/nb-2023/blob/main/Cgigas/code/02-ATAC.md

The big issue is that a different version of the genome was used. This all predates the Roslin genome work.

yaaminiv commented 1 year ago

Is there an easy way to map the old genome to the Roslin genome? Alternatively, I have a link that includes FASTQ files for all sequences, which we can map to the Roslin genome.

From Sascha:

I uploaded the whole directory here:

http://homer.ucsd.edu/sduttke/share/oyster.tar

It's a huge file (~200 GB+, I think).

If you just want to get started, use the fastq directory. There you will find:

- 5GRO-DB107-180430-oyster_8_5GRO_DB107_CATGGCAT_S33_R1_001.fastq.gz
- ATAC-JHS696-170217-Oyster1_Brachies_6--ATAC-JHS696_S7_R1_001.fastq.gz
- ATAC-JHS697-170223-Oyster2_7_Brachies--ATAC-JHS697_S16_R1_001.fastq.gz
- ATAC-JHS698-170217-Oyster1_Coeur_8--ATAC-JHS698_S8_R1_001.fastq.gz
- ATAC-JHS699-170223-Oyster2_9_Coeur--ATAC-JHS699_S17_R1_001.fastq.gz
- csRNAinput-JHS718-170228-Oyster1_6_Branchies--mSTART_input-JHS718_S20_R1_001.fastq.gz
- csRNAinput-JHS719-170228-Oyster2_7_Branchies--mSTART_input-JHS719_S2_R1_001.fastq.gz
- csRNAinput-JHS796-170323-Oyster_8--mSTART_input-JHS796_S5_R1_001.fastq.gz
- csRNAinput-JHS797-170323-Oyster_9--mSTART_input-JHS797_S6_R1_001.fastq.gz
- csRNA-JHS670-170224-Oyster1_6_branchies--mSTART-JHS670_S12_R1_001.fastq.gz
- csRNA-JHS671-170228-Oyster2_7_Branchies--mSTART-JHS671_S13_R1_001.fastq.gz
- csRNA-JHS817-170322-Oyster_8--mSTART-JHS817_S15_R1_001.fastq.gz
- csRNA-JHS818-170322-Oyster_9--mSTART-JHS818_S16_R1_001.fastq.gz
- old
- RNA-JHS741_R1.fastq.gz
- RNA-JHS741_R2.fastq.gz
- RNA-JHS742_R1.fastq.gz
- RNA-JHS742_R2.fastq.gz
- RNA-JHS743_R1.fastq.gz
- RNA-JHS743_R2.fastq.gz
- RNA-JHS744_R1.fastq.gz
- RNA-JHS744_R2.fastq.gz
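Since the labeling is inconsistent, it may help to pull the assay type and library identifier out of each file name before deciding which files need BED files. The snippet below is a rough sketch that assumes every FASTQ name starts with `<assay>-<library ID>`, which holds for the names listed above but is only an inferred convention. The helper names `parse_fastq_name` and `group_by_assay` are hypothetical.

```python
import re
from collections import defaultdict

# Assumed convention: "<assay>-<library ID>" at the start of the file name,
# e.g. "ATAC-JHS696-..." or "RNA-JHS741_R1.fastq.gz".
NAME_PATTERN = re.compile(r"^([^-]+)-([A-Za-z]+\d+)")


def parse_fastq_name(name):
    """Return (assay, library_id) parsed from the start of a FASTQ file name."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized FASTQ name: {name}")
    return match.group(1), match.group(2)


def group_by_assay(names):
    """Map each assay to the sorted list of distinct library IDs seen for it."""
    groups = defaultdict(set)
    for name in names:
        assay, library_id = parse_fastq_name(name)
        groups[assay].add(library_id)
    return {assay: sorted(ids) for assay, ids in groups.items()}


# A few of the file names from the listing, used as sample input.
sample_names = [
    "5GRO-DB107-180430-oyster_8_5GRO_DB107_CATGGCAT_S33_R1_001.fastq.gz",
    "ATAC-JHS696-170217-Oyster1_Brachies_6--ATAC-JHS696_S7_R1_001.fastq.gz",
    "csRNAinput-JHS718-170228-Oyster1_6_Branchies--mSTART_input-JHS718_S20_R1_001.fastq.gz",
    "csRNA-JHS670-170224-Oyster1_6_branchies--mSTART-JHS670_S12_R1_001.fastq.gz",
    "RNA-JHS741_R1.fastq.gz",
    "RNA-JHS741_R2.fastq.gz",
]

if __name__ == "__main__":
    for assay, ids in sorted(group_by_assay(sample_names).items()):
        print(assay, ", ".join(ids))
```

Entries that do not match the pattern (such as the `old` entry) raise a `ValueError`, so they would need to be filtered out or handled separately.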

yaaminiv commented 1 year ago

@shellywanamaker Did you ever download the FASTA/FASTQ files from Sascha's dataset? We only see bedgraphs in your folder on Gannet.

shellywanamaker commented 1 year ago

@yaaminiv Sascha didn't send me the FASTA/FASTQ files; all he sent me is what's in the folder on Gannet: https://gannet.fish.washington.edu/metacarcinus/Cgigas/shelly/. Here's the email exchange we had. The folder he shared was called shelly.tar, which is why the folder on Gannet is also just called shelly. The folder you posted above with the FASTQs is called oyster.tar, and I see that the link is broken. If the data was never downloaded, perhaps it's worth asking him to resend it; I'm sure he wouldn't mind.