msettles / expHTS

Python application for "Experimental High Throughput Sequencing"
Apache License 2.0

Explicit clarification on folders/naming/etc. #53

Open bricesarver opened 8 years ago

bricesarver commented 8 years ago

This might be a documentation issue, but I've had cases where expHTS simply won't work: it runs for a fraction of a second and produces results folders that are empty apart from a few (empty or effectively empty) logs. Sometimes I've gone through and renamed folders, symlinks, etc., and things then appear to work fine for no apparent reason. This is especially common when combining data from many different sources and experiments, such as a trial sequencing run, an initial sequencing effort, and an additional coverage run, which might have different naming schemes. Other times there are no problems. The folder and naming requirements need to be stated at the very front of the GitHub page. A recent example:

super_deduper found
sickle found
flash2 found
bowtie2 found
scythe found
___ PE COMMANDS ___
python /usr/local/lib/python2.7/dist-packages/expHTS-1.0.1.dev2-py2.7-linux-x86_64.egg/expHTS/cleanupWrapper.py  <(flash2 -Ti <(sickle pe -c <(super_deduper -i <(python /usr/local/lib/python2.7/dist-packages/expHTS-1.0.1.dev2-py2.7-linux-x86_64.egg/expHTS/extract_unmapped_reads.py  <(python /usr/local/lib/python2.7/dist-packages/expHTS-1.0.1.dev2-py2.7-linux-x86_64.egg/expHTS/screen.py -1 /scratch/1/brice/coding_expression/phylogenetic_data/test/00-RawData/caroli/caroli_10460_001_R1.fastq.gz,/scratch/1/brice/coding_expression/phylogenetic_data/test/00-RawData/caroli/caroli_1875_001_R1.fastq.gz  -2 /scratch/1/brice/coding_expression/phylogenetic_data/test/00-RawData/caroli/caroli_10460_001_R2.fastq.gz,/scratch/1/brice/coding_expression/phylogenetic_data/test/00-RawData/caroli/caroli_1875_001_R2.fastq.gz   -t 20 2>/dev/null)   -o stdout 2>02-Cleaned/caroli/PE_filter_info.log)  -p stdout 2>02-Cleaned/caroli/PE_deduper_info.log)   -m stdout -s /dev/null -t sanger -T  2>02-Cleaned/caroli/PE_sickle_info.log)   -M 700 --allow-outies -o caroli -d 02-Cleaned/caroli -To -c  2>02-Cleaned/caroli/flash_info.log)   0 0 50 02-Cleaned/caroli/caroli
02-Cleaned/caroli
Seconds: 0.0292479991913
Total amount of seconds to run all samples
Seconds: 0.0292479991913
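
Note that the command above sends the stderr of screen.py to /dev/null, so if an input path is wrong, nothing would show up in the logs; that may be one reason the run finishes almost instantly with empty output. As a stopgap, a pre-flight check on the inputs can at least rule out path problems. The following is a minimal sketch and is not part of expHTS: `split_read_list` and `check_fastq_inputs` are hypothetical helper names, and the only assumption about expHTS is that the `-1`/`-2` values are comma-separated lists of FASTQ paths, as in the command above.

```python
import os


def split_read_list(arg):
    """Split a comma-separated list of paths (as passed to -1/-2 above)."""
    return [p.strip() for p in arg.split(",") if p.strip()]


def check_fastq_inputs(paths):
    """Return (path, problem) tuples for inputs that look unusable.

    Hypothetical helper, not part of expHTS. It only checks the
    filesystem; it does not validate FASTQ contents or naming schemes.
    """
    problems = []
    for path in paths:
        if os.path.islink(path) and not os.path.exists(path):
            # The link itself exists, but its target does not.
            problems.append((path, "broken symlink"))
        elif not os.path.exists(path):
            problems.append((path, "missing"))
        elif not os.path.isfile(path):
            problems.append((path, "not a regular file"))
        elif os.path.getsize(path) == 0:
            problems.append((path, "empty file"))
    return problems


# Example use (paths are placeholders):
#   for path, problem in check_fastq_inputs(split_read_list(r1_arg)):
#       print("%s: %s" % (path, problem))
```

An empty list from `check_fastq_inputs` only means the files are present and non-empty; it says nothing about whether the sample and folder names match what expHTS expects, which is the part that still needs documenting.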

bricesarver commented 8 years ago

I still think this needs to be addressed. I have since tried various combinations of files and arguments, and I will open another thread that might shed some light on a deeper issue.

bricesarver commented 8 years ago

Additionally, the column headers in the logs could be made more informative and/or an explanation could be given in the markdown splash or README.