berman-lab / ymap

YMAP - Yeast Mapping Analysis Pipeline : An online pipeline for the analysis of yeast genomic datasets.
MIT License

process_input_files.php - $paired undefined in some branches #13

Closed vladimirg closed 9 years ago

vladimirg commented 9 years ago

When uploading a zipped FASTQ file containing single-end reads, the following warning is generated:

PHP Notice: Undefined variable: paired in /Users/bermanlab/dev/ymap/process_input_files.php on line 273

Purely technically, it seems that before the last if block, we should define $paired = 0, and remove all calls to exit within the block.
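
A minimal sketch of the "define a default first" part of that suggestion. This is not the actual code in process_input_files.php: the function name `detect_paired`, the `$ext` and `$fileCount` parameters, and the branch layout are all hypothetical and only illustrate the pattern of initializing $paired before any branching.

```php
<?php
// Hypothetical sketch; the real branch conditions in process_input_files.php differ.
// Assumed inputs: $ext is a lowercase file extension, $fileCount the number of uploaded files.
function detect_paired(string $ext, int $fileCount): int
{
    // Default assigned before any branching, so no path can leave $paired undefined.
    $paired = 0;

    if ($ext === 'fastq' && $fileCount === 2) {
        // Two FASTQ files are treated as a paired-end upload (assumption for this sketch).
        $paired = 1;
    } elseif ($ext === 'sam' || $ext === 'bam') {
        // SAM/BAM input is treated as paired (assumption for this sketch).
        $paired = 1;
    }
    // Any other case, including a single zipped FASTQ file, falls through with $paired = 0.

    return $paired;
}
```

With a default in place, a branch that forgets to assign $paired yields single-end behavior rather than an "Undefined variable" notice.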

One thing I don't understand is why some branches end up marked as paired and others do not. There seems to be no code that checks whether TXT or SAM/BAM files are paired; they are just assumed to be. Why?

darrenabbey commented 9 years ago

"Paired" refers to paired-end reads, in opposition to single-end reads. The only difference in processing between the two data types is in read alignment with bowtie2. The command options and input syntax differs between paired- and single-end reads, so the distinction has to be made up until that point.

The TXT file format has alignment information built in, and these files are directly translated to the intermediate files ("putative_SNPs_v4.txt" and "SNP_CNV_v1.txt") used later in the pipelines. Since data in this format enters the pipeline after bowtie2 has been used, there is no need to check or specify the paired/single distinction. A standard assumption is used however, so the

The SAM/BAM file format is deconstructed into paired-end fastQ files and then fed into bowtie2 for re-alignment. This is done so you can process SAM/BAM files that were aligned against a different version of the genome than the one you're working with. There are a couple of cases that need to be checked:

1) single-end fastQ => SAM/BAM => reformed into fastQ
2) paired-end fastQ => SAM/BAM => reformed into fastQ

Right now, the conversion assumes that any SAM/BAM input yields paired fastQ files, but I'm not sure whether that assumption is valid.
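
One possible way to test this, sketched here and not part of YMAP: in the SAM format, bit 0x1 of the FLAG field (the second tab-separated column) is set when the read belongs to a template with multiple segments, i.e. a paired read. The function name `sam_record_is_paired` is hypothetical, and the sketch handles text SAM lines only; a BAM file would first have to be converted to SAM text.

```php
<?php
// Hypothetical check on one line of SAM text; returns null for lines that are not alignment records.
function sam_record_is_paired(string $samLine): ?bool
{
    // Header lines start with '@' and carry no FLAG field.
    if ($samLine === '' || $samLine[0] === '@') {
        return null;
    }

    // An alignment record has at least 11 mandatory tab-separated fields.
    $fields = explode("\t", rtrim($samLine, "\r\n"));
    if (count($fields) < 11) {
        return null;
    }

    // FLAG is the second field; bit 0x1 marks a read from a multi-segment (paired) template.
    $flag = (int) $fields[1];
    return ($flag & 0x1) !== 0;
}
```

Checking the first few alignment records this way would let the conversion step choose between single-end and paired-end fastQ output instead of assuming paired.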

darrenabbey commented 9 years ago

$paired should definitely be defined in all cases.

The exit conditions are for cases where the input data cannot be processed. In those cases an error file should be generated for display in that project's user interface element, and the server-side processing should then be terminated.

darrenabbey commented 9 years ago

$paired is now defined in the branches where it was previously undefined. The exit conditions are retained.