JaneliaSciComp / msg

Multiplexed Shotgun Genotyping
http://genomics.princeton.edu/AndolfattoLab/MSG.html

Parsing - If an individual has no reads, produce error message and continue instead of dying. #31

Closed gregpinero closed 12 years ago

gregpinero commented 12 years ago

If an individual has no reads, produce error message and continue instead of dying.

Original Email:

When parse_and_map hits a file that has no reads, it crashes. See the error output below. In this run, I had three parsed individuals with no reads, and all three crashed. This leaves the job in the Eqw state and may cause other downstream havoc. We should check each file for reads and exit gracefully if no reads are present in the file.

total_reads 90423405
[login - sternd@e02u21]~/KellyDavid>cat msgRun2.7901.e5681973.24
hostname: Undefined variable.
Unknown option: recrate
parse_and_map.py: ERROR: Caught exception
Traceback (most recent call last):
  File "/misc/local/msg/cmdline/cmdline.py", line 162, in run
    exit_code = self.main(*main_args)
  File "msg/parse_and_map.py", line 136, in main
    self.map()
  File "msg/parse_and_map.py", line 351, in map
    file_par2_log, misc_indiv_log)
  File "msg/parse_and_map.py", line 297, in _map_w_stampy
    shell=True, stdout=misc_indiv_log, stderr=misc_indiv_log)
  File "/usr/local/msg/lib/python2.6/subprocess.py", line 488, in check_call
    raise CalledProcessError(retcode, cmd)
CalledProcessError: Command '['samtools view -btSh -o ./Sample_KML.fastq.gz.trim.fastq_sam_files/aln_indivSAN_1482_GAGCAC_par1.sam.bam ./Sample_KML.fastq.gz.trim.fastq_sam_files/aln_indivSAN_1482_GAGCAC_par1.sam']' returned non-zero exit status 1
Error in python msg/parse_and_map.py -i Sample_KML.fastq.gz.trim.fastq -b BC3_DLS_BCbarcode.csv.24 --parent1 parent1_ref.fa --parent2 parent2_ref.fa --map-only --re_cutter --linker_system --bwa_alg aln --bwa_threads 8 --use_stampy 1 --stampy_premap_w_bwa 1 --indiv_stampy_substitution_rate 0.01 --indiv_mapq_filter 20: 256 at msg/msg.pl line 13.
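For illustration, the check requested in the email (look at each file for reads, report an error, and move on) might look roughly like the sketch below. It is not taken from parse_and_map.py: the names has_reads, map_individuals, and map_one are hypothetical, and it is written for Python 3 rather than the Python 2.6 shown in the traceback.

```python
import gzip
import logging
import os


def has_reads(fastq_path):
    """Return True if the FASTQ file contains at least one non-blank line.

    Hypothetical helper, not part of msg. Handles plain and gzipped files.
    """
    if not os.path.exists(fastq_path) or os.path.getsize(fastq_path) == 0:
        return False
    opener = gzip.open if fastq_path.endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        for line in handle:
            if line.strip():
                return True
    return False


def map_individuals(fastq_paths, map_one):
    """Call map_one on each individual that has reads.

    Individuals with no reads are logged as errors and skipped instead of
    aborting the whole run. Returns the list of skipped paths.
    """
    skipped = []
    for path in fastq_paths:
        if not has_reads(path):
            logging.error("No reads for %s; skipping this individual", path)
            skipped.append(path)
            continue
        map_one(path)
    return skipped
```

Returning the skipped paths lets the caller summarize them at the end of the run, which would also make negative controls with zero reads easy to spot.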

gregpinero commented 12 years ago

This may be ok.

Relevant email:

Strangely, the Eqw jobs that I thought were due to no reads restarted automatically. Turns out these were not the 0 read jobs and those do not appear to have caused problems. Also, some of the 0 read jobs were negative controls, which worked very well. The files with 0 reads produce sam files with no matches and no hmm_data folders. Which is all OK. So, as far as I can tell, we do not have a bug related to individuals with 0 reads. Let's scratch this issue.
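If it is ever useful to confirm which individuals ended up with SAM files containing no matches, a small check along these lines could do it. This is only a sketch: count_sam_records is a hypothetical helper, not something in msg, and it assumes text SAM files in which header lines start with "@" and every other non-blank line is an alignment record.

```python
def count_sam_records(sam_path):
    """Count alignment records (non-header, non-blank lines) in a text SAM file.

    Hypothetical helper, not part of msg.
    """
    count = 0
    with open(sam_path) as handle:
        for line in handle:
            # Header lines start with "@"; read names cannot.
            if line.strip() and not line.startswith("@"):
                count += 1
    return count
```

A result of 0 would mark a header-only SAM file, which matches the zero-read case described above.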