Closed sr320 closed 4 years ago
Updated Meth_Compare_Pipeline.md with MD5 checksums for the trimmed files.
MD5sum.txt files can be found here:
On Gannet:
Rsync from Mox to Gannet log here: https://gannet.fish.washington.edu/metacarcinus/FROGER_meth_compare/20200311/mox2gannet_rsync.log
Or they are still currently on Mox here:
/gscratch/scrubbed/strigg/analyses/20200311/RRBS/md5sum.txt
/gscratch/scrubbed/strigg/analyses/20200311/WGBS_MBD/md5sum.txt
@sr320 you also commented "WOULD LOVE TO SEE CONCRETE VALIDATION ON TRIMMING ABOVE"
What are you hoping for in terms of validation? Do we need to re-open this issue? https://github.com/hputnam/Meth_Compare/issues/14
Just that a third party would need to read Meth_Compare_Pipeline.md and be convinced both that the work can be reproduced and that trimming was done properly (e.g., by visualizing something in the markdown file).
Also, I would say the reader cannot easily see or verify that the MD5s of the files you trimmed match the Genewiz MD5s.
@sr320 I can post examples of FastQC sequence diversity plots before and after trimming in the markdown file. Would that help, or is there a better way to show this? Or should the validation go in a separate file?
For your second point, I'm confused. Can we chat?
This has been completed and Meth_Compare_Pipeline.md has been updated.
Adding this for potential future reference...
@shellytrigg I looked at your code and you can greatly simplify this in the future by using the built-in --check argument of the md5sum program.
Basically, this is how the whole process would go:
# Change to directory with files that need checksums
cd working_dir
# Generate checksums
md5sum check_this_file.fastq.gz > checksums.md5
# Verify checksums at a later date
md5sum --check checksums.md5
md5sum can use a checksum file (which contains a list of files and their corresponding checksums) as a means to verify checksums. The output will be a list of the filenames and an indication of pass/fail.
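For anyone scripting this, here is a minimal sketch of the same workflow extended to several files, assuming GNU coreutils md5sum. The file names and the temporary directory are hypothetical placeholders, not paths from this project. It relies on md5sum --check exiting with status 0 only when every listed file matches, so a script can test the exit status instead of reading the per-file output.

```shell
# Work in a throwaway directory so the sketch does not touch real data
work_dir=$(mktemp -d)
cd "$work_dir" || exit 1

# Two small placeholder files standing in for FASTQ files (hypothetical names)
printf 'ACGT\n' > sample_1.fastq
printf 'TTGA\n' > sample_2.fastq

# Generate checksums for every matching file in one call
md5sum *.fastq > checksums.md5

# Verify; --quiet suppresses the OK lines so only failures are printed.
# The exit status is 0 only if every file matches its recorded checksum.
md5sum --check --quiet checksums.md5
intact_status=$?

# Simulate corruption of one file, then verify again.
# The exit status is now non-zero and the failing file is reported.
printf 'N\n' >> sample_2.fastq
md5sum --check --quiet checksums.md5
corrupt_status=$?

echo "intact: $intact_status, corrupted: $corrupt_status"
```

The same pattern should apply to a checksum file supplied by a sequencing provider, as long as it is in md5sum's format and the command is run from the directory that the listed file names are relative to.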