refresh-bio / SPLASH

Error: unknown exception Command exited with non-zero status 1 #6

Closed cecile-meier-scherling closed 1 year ago

cecile-meier-scherling commented 1 year ago

Hello, I have tried running splash with a couple of different fastq datasets, but I am running into the following error:

Error: unknown exception
Command exited with non-zero status 1

I have formatted the input file as described, with one sample per line, where each line contains the sample name and the path to the sample file separated by a space. The paths to the input sample files are relative to the working directory. Currently, my input file lists 16 fastq files.
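A quick way to rule out formatting problems in such a list is to check it before running the pipeline. The following is a minimal sketch, not part of SPLASH: the function name `check_sample_list` and the `base_dir` parameter are hypothetical, and it assumes only the layout described above (one `name path` pair per line, separated by whitespace, with paths relative to the working directory).

```python
import os


def check_sample_list(list_path, base_dir="."):
    """Return (line_number, problem) tuples for a 'name path' sample list.

    Hypothetical helper: assumes one sample per line, two
    whitespace-separated fields, and paths relative to base_dir.
    """
    problems = []
    with open(list_path) as fh:
        for lineno, line in enumerate(fh, start=1):
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            fields = line.split()
            if len(fields) != 2:
                problems.append((lineno, f"expected 2 fields, got {len(fields)}"))
                continue
            _name, sample_path = fields
            # os.path.join leaves absolute paths untouched
            if not os.path.isfile(os.path.join(base_dir, sample_path)):
                problems.append((lineno, f"file not found: {sample_path}"))
    return problems
```

An empty result means every line has two fields and points at an existing file. It says nothing about whether the files themselves are valid.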

I have tried to debug this error by running splash on different fastq files. However, I get the same error each time, which makes me think there may be an issue with the formatting of my input file.

Has anyone else experienced this error while running splash?

marekkokot commented 1 year ago

Hello, thank you for reporting. According to your description, I think you are doing everything right. Could you please try to prepare some fake fastq files (just with a couple of arbitrary reads) and check if it also fails and if so, share these files, your input, and the exact command line you are using?

You may also try using the contents of the example directory. There is a download.py script to download example input data and the input.txt, and you can run splash on this and let me know if it also fails.

BTW, I think this error comes from KMC, which is part of the SPLASH pipeline, and in most cases it occurs when something is wrong with the input. For example, it may occur if one uses fasta instead of fastq files, but as I understand it you are using fastq files.

Anyway thank you for reporting and let me know if more info is needed at this point and how the above went.

cecile-meier-scherling commented 1 year ago

Thank you, @marekkokot for your response! I should have mentioned that the example data works fine, and I don't run into any issues. I will make some fake fastq files and run splash and will keep you updated.

marekkokot commented 1 year ago

Ok, if it works on fake files, maybe you could take the head of each of your real files and check those. BTW, how big are your input files? Maybe there is a way to share them?
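Taking the head of a FASTQ file can be sketched as below. This is an illustration only: `fastq_head` is a hypothetical helper, and it assumes the common four-lines-per-read layout (header starting with `@`, sequence, separator starting with `+`, qualities), which does not cover multi-line FASTQ.

```python
from itertools import islice


def fastq_head(lines, n_reads=1000):
    """Return the first n_reads complete 4-line records from text lines.

    Hypothetical helper: assumes exactly four lines per read.
    """
    head = list(islice(lines, 4 * n_reads))
    # Drop a trailing partial record, if the input ended mid-read.
    head = head[: len(head) - len(head) % 4]
    for i in range(0, len(head), 4):
        if not head[i].startswith("@") or not head[i + 2].startswith("+"):
            raise ValueError(f"record {i // 4 + 1} does not look like FASTQ")
    return head
```

For a plain-text file, something like `open("head.fastq", "w").writelines(fastq_head(open("sample.fastq")))` would write a small test file; for a gzipped one, `gzip.open(path, "rt")` can supply the lines instead.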

cecile-meier-scherling commented 1 year ago

Hello, I created some fake fastq files using fastq_generator (https://github.com/johanzi/fastq_generator) and ran SPLASH, which gave me the same error as before. I have attached my input file and fake fastq files.

fake_fastq_input.txt test3.fastq.gz test2.fastq.gz test1.fastq.gz

This is my output from SPLASH

Starting stage 1
Current time: 2023-07-13 15:40:51
Error running command: /usr/bin/time -v /gpfs/runtime/opt/splash/2.1.4/bin/kmc -t8 -b -ci1 -cs65535 -k54  -m12 t1.fastq.gz splash-tmp-7d48568c1a4f4f0681ef9ce853d7618f/t1 splash-tmp-7d48568c1a4f4f0681ef9ce853d7618f/kmc_tmp_t1
For details check logs/stage_1_thread-0000.log
Error running command: /usr/bin/time -v /gpfs/runtime/opt/splash/2.1.4/bin/kmc -t8 -b -ci1 -cs65535 -k54  -m12 t2.fastq.gz splash-tmp-7d48568c1a4f4f0681ef9ce853d7618f/t2 splash-tmp-7d48568c1a4f4f0681ef9ce853d7618f/kmc_tmp_t2
For details check logs/stage_1_thread-0001.log
Error running command: /usr/bin/time -v /gpfs/runtime/opt/splash/2.1.4/bin/kmc -t8 -b -ci1 -cs65535 -k54  -m12 t3.fastq.gz splash-tmp-7d48568c1a4f4f0681ef9ce853d7618f/t3 splash-tmp-7d48568c1a4f4f0681ef9ce853d7618f/kmc_tmp_t3
For details check logs/stage_1_thread-0002.log
Exiting because of previous error

It seems like the fastq files cannot be read in their current format.

marekkokot commented 1 year ago

Hi, thanks. I downloaded them, but it seems these files are not gzipped, although they have .gz in their names. In that case SPLASH will still try to gunzip them, so maybe this is the reason. If your files are in fact plain text and not gzipped, try renaming them to just *.fastq. Let me know if this helps.
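One way to spot this kind of mismatch is to look at the first two bytes of each file, since gzip streams start with the magic bytes `1f 8b`. The sketch below is not SPLASH or KMC code; `is_gzipped` and `find_mislabeled` are hypothetical names, and the check only inspects the header, not the integrity of the whole archive.

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def is_gzipped(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC


def find_mislabeled(paths):
    """Return the paths that end in .gz but are not gzip-compressed."""
    return [p for p in paths if p.endswith(".gz") and not is_gzipped(p)]
```

Any path returned by `find_mislabeled` is a candidate for renaming to plain `.fastq` (or for actually compressing with gzip).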

cecile-meier-scherling commented 1 year ago

Thank you very much - that solved the issue!