smithlabcode / falco

A C++ drop-in replacement of FastQC to assess the quality of sequence read data
https://falco.readthedocs.io
GNU General Public License v3.0

output is overridden when multiple fastq files are provided. #10

Open guilhermesena1 opened 3 years ago

guilhermesena1 commented 3 years ago

If multiple inputs are given and the -o flag sets a directory name, only the results for the last file show up. This happens because every report is written to the same path, output_dir/fastqc_data.txt, so each file overwrites the previous one.

FastQC zips each report. We should create subdirectories within the output directory, one for each file name, but only if more than one file is provided.

guilhermesena1 commented 3 years ago

e662544641098c9f4b76869a17a921340c4c958e should fix this.

kevin-wamae commented 1 year ago

Hi @guilhermesena1, the bug seems to have persisted with v1.2.1 installed from conda.

I analysed two fastq files and redirected the output to a directory.

falco writes to the same files and fails to add prefixes.

falco fastq/* --outdir falco_out
[Tue Nov 29 09:11:02 2022] creating directory for output: falco_out
[limits] using file /home/user/miniconda3/envs/genomics/opt/falco/Configuration/limits.txt
[adapters] using file /home/user/miniconda3/envs/genomics/opt/falco/Configuration/adapter_list.txt
[contaminants] using file /home/user/miniconda3/envs/genomics/opt/falco/Configuration/contaminant_list.txt
[Tue Nov 29 09:11:02 2022] Started reading file fastq/pfs-1_R1.fastq.gz
[Tue Nov 29 09:11:02 2022] reading file as gzipped FASTQ format
[running falco|===================================================|100%]
[Tue Nov 29 09:11:02 2022] Finished reading file
[Tue Nov 29 09:11:02 2022] Writing summary to falco_out/_summary.txt
[Tue Nov 29 09:11:02 2022] Writing text report to falco_out/_fastqc_data.txt
[Tue Nov 29 09:11:02 2022] Writing HTML report to falco_out/_fastqc_report.html
Elapsed time for file fastq/pfs-1_R1.fastq.gz: 0s
[limits] using file /home/user/miniconda3/envs/genomics/opt/falco/Configuration/limits.txt
[adapters] using file /home/user/miniconda3/envs/genomics/opt/falco/Configuration/adapter_list.txt
[contaminants] using file /home/user/miniconda3/envs/genomics/opt/falco/Configuration/contaminant_list.txt
[Tue Nov 29 09:11:02 2022] Started reading file fastq/pfs-1_R2.fastq.gz
[Tue Nov 29 09:11:02 2022] reading file as gzipped FASTQ format
[running falco|===================================================|100%]
[Tue Nov 29 09:11:02 2022] Finished reading file
[Tue Nov 29 09:11:02 2022] Writing summary to falco_out/_summary.txt
[Tue Nov 29 09:11:02 2022] Writing text report to falco_out/_fastqc_data.txt
[Tue Nov 29 09:11:02 2022] Writing HTML report to falco_out/_fastqc_report.html
Elapsed time for file fastq/pfs-1_R2.fastq.gz: 0s

guilhermesena1 commented 1 year ago

Hello,

Thank you for reporting the issue. That's really strange.

From the empty names before the underscore I think this may be an issue with how filenames are escaped.

Would you be able to just answer one question and run two quick tests?

(1) question: which operating system and shell are you using (bash? zsh?)

(2) does the program behave as expected if you change the command to

falco $(ls -1 fastq | tr '\n' ' ') --outdir falco_out

(3) does the program behave as expected if you switch the order of the arguments and put the filenames at the end? (that's how the program should be run) i.e.

falco --outdir falco_out fastq/*

Either way I'll try to reproduce as soon as I have access to my computer again.

Thank you!

guilhermesena1 commented 1 year ago

I pushed a fix in the most recent commit (a97182e). Unfortunately, for now this fix is only available by cloning and compiling the repo. I'll leave this open until we create a new release and update the conda package. Thanks again for reporting!
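For anyone still on a release without the fix, one possible workaround is to run falco once per input file, each with its own output directory, so the reports cannot collide. The sketch below assumes that a single-file run with --outdir behaves normally; outdir_for and run_all are hypothetical helper names, not part of falco, and the suffix stripping is only an illustration.

```shell
#!/usr/bin/env bash
# Workaround sketch (assumption: one falco run per file avoids the overwrite).

# outdir_for BASE PATH
# Hypothetical helper: maps fastq/pfs-1_R1.fastq.gz to BASE/pfs-1_R1
# by dropping the directory and the .gz, .fastq and .fq suffixes.
outdir_for() {
  local base path name
  base="$1"
  path="$2"
  name="$(basename "$path")"
  name="${name%.gz}"
  name="${name%.fastq}"
  name="${name%.fq}"
  printf '%s/%s\n' "$base" "$name"
}

# run_all BASE FILE...
# Hypothetical helper: calls falco once per input file, with options
# placed before the filename and a separate --outdir for each file.
run_all() {
  local base dir f
  base="$1"
  shift
  for f in "$@"; do
    dir="$(outdir_for "$base" "$f")"
    mkdir -p "$dir"
    falco --outdir "$dir" "$f"
  done
}

# Usage (requires falco on PATH):
#   run_all falco_out fastq/*
```

With the two files from the report above, this would place the reports under falco_out/pfs-1_R1 and falco_out/pfs-1_R2 instead of writing both to falco_out.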

kevin-wamae commented 1 year ago

Thanks @guilhermesena1 and sorry for the late response.

I cloned the repository, ran make all, and got the error below.

I'm using Ubuntu 20.04.5 LTS.

make[1]: Entering directory '/home/kwamae/software/falco/src'
g++ -Wall -std=c++11 -O3 -c -o FalcoConfig.o FalcoConfig.cpp -DPROGRAM_PATH=\"/home/kwamae/software/falco\"
g++ -Wall -std=c++11 -O3 -c -o FastqStats.o FastqStats.cpp -DPROGRAM_PATH=\"/home/kwamae/software/falco\"
g++ -Wall -std=c++11 -O3 -c -o HtmlMaker.o HtmlMaker.cpp -DPROGRAM_PATH=\"/home/kwamae/software/falco\"
g++ -Wall -std=c++11 -O3 -c -o Module.o Module.cpp -DPROGRAM_PATH=\"/home/kwamae/software/falco\"
g++ -Wall -std=c++11 -O3 -c -o OptionParser.o OptionParser.cpp -DPROGRAM_PATH=\"/home/kwamae/software/falco\"
g++ -Wall -std=c++11 -O3 -c -o smithlab_utils.o smithlab_utils.cpp -DPROGRAM_PATH=\"/home/kwamae/software/falco\"
g++ -Wall -std=c++11 -O3 -c -o StreamReader.o StreamReader.cpp -DPROGRAM_PATH=\"/home/kwamae/software/falco\"
In file included from StreamReader.cpp:16:
StreamReader.hpp:23:10: fatal error: zlib.h: No such file or directory
   23 | #include <zlib.h>
      |          ^~~~
compilation terminated.
make[1]: [Makefile:43: StreamReader.o] Error 1
make[1]: Leaving directory '/home/kwamae/software/falco/src'
make: [Makefile:20: all] Error 2

andrewdavidsmith commented 1 year ago

Quick look suggests you're missing zlib. See the README.md file for installation instructions of dependencies. Should be straightforward but let us know.
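A quick way to confirm the diagnosis before rebuilding is to ask the compiler whether it can find the header at all. This is only a sketch: have_header is a hypothetical helper, the compiler is assumed to be g++ unless CXX is set, and the package name in the message is the usual one on Ubuntu rather than something taken from the falco README.

```shell
#!/usr/bin/env bash
# have_header NAME
# Hypothetical helper: succeeds if the C++ compiler can locate <NAME>.
# It preprocesses a one-line program read from stdin and discards the output.
have_header() {
  printf '#include <%s>\n' "$1" | "${CXX:-g++}" -E -x c++ - >/dev/null 2>&1
}

if have_header zlib.h; then
  echo "zlib.h found; the build should get past StreamReader.cpp"
else
  # Assumption: on Ubuntu the zlib headers usually come from zlib1g-dev.
  echo "zlib.h not found; try: sudo apt-get install zlib1g-dev"
fi
```

If the header is reported as found but make still fails, the compiler used by the Makefile may differ from the one checked here.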

kevin-wamae commented 1 year ago

Thanks, @andrewdavidsmith. :man_facepalming:

@guilhermesena1, thanks for the fix. It works fine now.