GoekeLab / sg-nex-data

Nanopore RNA-Seq data from the Singapore Nanopore-Expression Project

Naming of samples #55

Open Kiliankleemann opened 7 months ago

Kiliankleemann commented 7 months ago

I'm having trouble understanding the naming convention of the samples. Some cell lines have missing replicates (e.g. only 1, 5, 6), and some replicates have only run1 or run2. I would greatly appreciate your help on how to process the samples.

cying111 commented 7 months ago

Hi @Kiliankleemann, thanks for raising this.

The naming convention is based on how the samples were generated in our lab:

  1. The replicate number ("Rep") identifies the biological replicate and is assigned across ONT protocols. Some replicates were sequenced with all ONT protocols, while others are available for only some of them.
  2. Similarly, the run number ("Run") identifies the technical run and is also assigned across ONT protocols. That is why run numbers can look non-continuous within a single ONT protocol; across all ONT protocols they should be continuous and complete.
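The convention described above can be sketched in a few lines of Python. This is only an illustration, assuming sample names follow the pattern `SGNex_<cellLine>_<protocol>_replicate<N>_run<M>` seen in the directory names in this thread. The helper names `parse_sample` and `runs_across_protocols` are hypothetical, and the `directcDNA` and `cDNA` sample names in the example are made up to show the idea; they are not taken from the dataset.

```python
import re
from collections import defaultdict

# Assumed pattern, inferred from names such as SGNex_H9_directRNA_replicate2_run1.
SAMPLE_RE = re.compile(
    r"^SGNex_(?P<cell_line>[^_]+)_(?P<protocol>[^_]+)"
    r"_replicate(?P<replicate>\d+)_run(?P<run>\d+)$"
)


def parse_sample(name):
    """Split a sample name into (cell_line, protocol, replicate, run)."""
    match = SAMPLE_RE.match(name)
    if match is None:
        raise ValueError(f"name does not follow the assumed pattern: {name!r}")
    fields = match.groupdict()
    return (
        fields["cell_line"],
        fields["protocol"],
        int(fields["replicate"]),
        int(fields["run"]),
    )


def runs_across_protocols(names):
    """Map (cell_line, replicate) to a sorted list of (run, protocol) pairs."""
    grouped = defaultdict(list)
    for name in names:
        cell_line, protocol, replicate, run = parse_sample(name)
        grouped[(cell_line, replicate)].append((run, protocol))
    return {key: sorted(runs) for key, runs in grouped.items()}


# The two directRNA names appear in this thread; the other two are hypothetical.
example_names = [
    "SGNex_Hct116_directRNA_replicate3_run1",
    "SGNex_Hct116_directRNA_replicate3_run4",
    "SGNex_Hct116_directcDNA_replicate3_run2",  # hypothetical
    "SGNex_Hct116_cDNA_replicate3_run3",        # hypothetical
]

for key, runs in runs_across_protocols(example_names).items():
    print(key, runs)
```

Grouping by cell line and replicate while ignoring the protocol is what makes the run numbers line up: a gap inside one protocol may simply be a run that was sequenced with another protocol.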

For the matched Illumina samples, we selected three of the replicates generated with ONT protocols for each core cell line, so you may find replicates missing from the short-read data.

Please let me know if this answers your question.

We also welcome any suggestions on the naming convention we currently use, and we are happy to discuss this further.

Thank you. Regards, Ying

Kiliankleemann commented 7 months ago

Hi Ying, thank you for your response. I am still a bit confused. I downloaded all the fastq data and, for example, in the direct RNA-seq data for the H9 cells I see run 1 and run 2 for all biological replicates except replicate 1:

/media/kilian/My Book/ONT_longread_singapore/direct_RNA/raw_data/SGNex_H9_directRNA_replicate1_run1
/media/kilian/My Book/ONT_longread_singapore/direct_RNA/raw_data/SGNex_H9_directRNA_replicate2_run1
/media/kilian/My Book/ONT_longread_singapore/direct_RNA/raw_data/SGNex_H9_directRNA_replicate2_run2
/media/kilian/My Book/ONT_longread_singapore/direct_RNA/raw_data/SGNex_H9_directRNA_replicate3_run1
/media/kilian/My Book/ONT_longread_singapore/direct_RNA/raw_data/SGNex_H9_directRNA_replicate3_run2
/media/kilian/My Book/ONT_longread_singapore/direct_RNA/raw_data/SGNex_H9_directRNA_replicate4_run1
/media/kilian/My Book/ONT_longread_singapore/direct_RNA/raw_data/SGNex_H9_directRNA_replicate4_run2

For the Hct116 cells it's more patchy:

/media/kilian/My Book/ONT_longread_singapore/direct_RNA/raw_data/SGNex_Hct116_directRNA_replicate1_run1
/media/kilian/My Book/ONT_longread_singapore/direct_RNA/raw_data/SGNex_Hct116_directRNA_replicate1_run2
/media/kilian/My Book/ONT_longread_singapore/direct_RNA/raw_data/SGNex_Hct116_directRNA_replicate1_run3
/media/kilian/My Book/ONT_longread_singapore/direct_RNA/raw_data/SGNex_Hct116_directRNA_replicate2_run1
/media/kilian/My Book/ONT_longread_singapore/direct_RNA/raw_data/SGNex_Hct116_directRNA_replicate2_run2
/media/kilian/My Book/ONT_longread_singapore/direct_RNA/raw_data/SGNex_Hct116_directRNA_replicate2_run3
/media/kilian/My Book/ONT_longread_singapore/direct_RNA/raw_data/SGNex_Hct116_directRNA_replicate2_run4
/media/kilian/My Book/ONT_longread_singapore/direct_RNA/raw_data/SGNex_Hct116_directRNA_replicate2_run5
/media/kilian/My Book/ONT_longread_singapore/direct_RNA/raw_data/SGNex_Hct116_directRNA_replicate2_run6
/media/kilian/My Book/ONT_longread_singapore/direct_RNA/raw_data/SGNex_Hct116_directRNA_replicate3_run1
/media/kilian/My Book/ONT_longread_singapore/direct_RNA/raw_data/SGNex_Hct116_directRNA_replicate3_run4
/media/kilian/My Book/ONT_longread_singapore/direct_RNA/raw_data/SGNex_Hct116_directRNA_replicate4_run3
/media/kilian/My Book/ONT_longread_singapore/direct_RNA/raw_data/SGNex_Hct116_directRNA_replicate6_run1
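A listing like the Hct116 one can be summarised per replicate with a short script. This is a rough sketch, assuming every directory name ends in `_replicate<N>_run<M>`; the function names `runs_per_replicate` and `missing_runs` are hypothetical. It only reports which run numbers are absent within this single protocol. Whether those runs exist under another ONT protocol, as the reply above suggests, has to be checked against the full sample list.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

BASE = "/media/kilian/My Book/ONT_longread_singapore/direct_RNA/raw_data"

# Hct116 direct RNA directory names as listed in this thread.
HCT116_NAMES = [
    "SGNex_Hct116_directRNA_replicate1_run1",
    "SGNex_Hct116_directRNA_replicate1_run2",
    "SGNex_Hct116_directRNA_replicate1_run3",
    "SGNex_Hct116_directRNA_replicate2_run1",
    "SGNex_Hct116_directRNA_replicate2_run2",
    "SGNex_Hct116_directRNA_replicate2_run3",
    "SGNex_Hct116_directRNA_replicate2_run4",
    "SGNex_Hct116_directRNA_replicate2_run5",
    "SGNex_Hct116_directRNA_replicate2_run6",
    "SGNex_Hct116_directRNA_replicate3_run1",
    "SGNex_Hct116_directRNA_replicate3_run4",
    "SGNex_Hct116_directRNA_replicate4_run3",
    "SGNex_Hct116_directRNA_replicate6_run1",
]
paths = [f"{BASE}/{name}" for name in HCT116_NAMES]


def runs_per_replicate(paths):
    """Map each replicate number to the sorted run numbers found for it."""
    runs = defaultdict(set)
    for path in paths:
        name = PurePosixPath(path).name  # last path component, spaces are fine
        match = re.search(r"_replicate(\d+)_run(\d+)$", name)
        if match is None:
            continue  # skip anything that does not follow the assumed pattern
        runs[int(match.group(1))].add(int(match.group(2)))
    return {rep: sorted(found) for rep, found in sorted(runs.items())}


def missing_runs(runs):
    """Run numbers between 1 and the highest observed run that are absent."""
    present = set(runs)
    return [r for r in range(1, max(present) + 1) if r not in present]


summary = runs_per_replicate(paths)
for rep, found in summary.items():
    print(f"replicate {rep}: runs {found}, absent in this protocol {missing_runs(found)}")
```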