laurahspencer / DuMOAR

Confirm two lanes per sample for MBD-BS data #11

Open laurahspencer opened 1 year ago

laurahspencer commented 1 year ago

The MBD-BS directory on Sedna has two subdirectories, "4416" and "4417", each with two fastq.gz files per sample (R1 and R2). The filenames in the two subdirectories are identical, but the file sizes differ. For example:

[lspencer@sedna MBDseq_UOdata]$ du -sh 4416/CH01_06* 4417/CH01_06*
617M    4416/CH01_06_S1_L001_R1_001.fastq.gz
551M    4416/CH01_06_S1_L001_R2_001.fastq.gz
721M    4417/CH01_06_S1_L001_R1_001.fastq.gz
704M    4417/CH01_06_S1_L001_R2_001.fastq.gz

I presume each sample was run across two lanes, but it's odd that all of the filenames include "L001". @mgavery, were the samples run across two lanes?
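
One way to check this without waiting on the facility might be to read the first header line of each file. This is only a sketch: it assumes the reads carry standard Illumina (CASAVA 1.8+) headers, where the colon-separated fields after the instrument name are run number, flowcell ID, and lane. The function name `fastq_lane_info` is made up for illustration.

```shell
# Print run number, flowcell ID, and lane from the first read header of a gzipped FASTQ.
# Assumes the Illumina (CASAVA 1.8+) header layout:
#   @<instrument>:<run>:<flowcell>:<lane>:<tile>:<x>:<y> <read>:<filtered>:<control>:<index>
fastq_lane_info() {
    gzip -cd "$1" | head -n 1 | cut -d ':' -f 2-4
}

# Compare the same sample in both subdirectories (paths taken from the listing above).
for f in 4416/CH01_06_S1_L001_R1_001.fastq.gz 4417/CH01_06_S1_L001_R1_001.fastq.gz; do
    if [ -f "$f" ]; then
        printf '%s\t%s\n' "$f" "$(fastq_lane_info "$f")"
    fi
done
```

If the two directories report different lane numbers or different flowcell IDs, that would suggest two lanes or two separate runs, even though both sets of filenames say "L001".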

sr320 commented 1 year ago

Related: I think it would be great to document the provenance of the data from the sequencing facility (including checksums), per https://github.com/laurahspencer/DuMOAR/issues/10
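
For the checksum part, a minimal sketch along these lines might work, assuming GNU coreutils `md5sum` is available on Sedna. The function names and the manifest name `checksums.md5` are placeholders, not anything the facility provided. Because the filenames are identical in both subdirectories, the manifest is written per directory.

```shell
# Write an MD5 manifest for every fastq.gz file in the given directory.
# "checksums.md5" is a placeholder name chosen for this sketch.
write_md5_manifest() {
    ( cd "$1" && md5sum -- *.fastq.gz > checksums.md5 )
}

# Re-check the files against the manifest; returns non-zero on any mismatch.
verify_md5_manifest() {
    ( cd "$1" && md5sum --check --quiet checksums.md5 )
}

# Example usage (run from the directory that contains 4416 and 4417):
#   write_md5_manifest 4416 && verify_md5_manifest 4416
#   write_md5_manifest 4417 && verify_md5_manifest 4417
```

If the facility supplied its own checksums, comparing against those would be the stronger provenance check; the manifest above only records the state of the files as they currently sit on disk.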