davidebolo1993 / VISOR

VarIant SimulatOR for short, long and linked reads
GNU Lesser General Public License v3.0

Error for VISOR XENIA #20

Closed by victor0104 2 years ago

victor0104 commented 2 years ago

Hi,

I ran the following command with the attached files and got the error below. How can I fix this problem? Thank you.

Command:

VISOR XENIA -s hack.1.out/ -b HACk.h1.bed -o xenia.1.out

Error message:

[24/11/2021 08:16:38][Message][BETA] VISOR XENIA v1.1
[24/11/2021 08:16:39][Message] Preparing for bulk simulations with a single clone
[24/11/2021 08:16:39][Message] Processing haplotype 1
[24/11/2021 08:16:39][Message] Simulating from region chr22:15000000-16000000
[24/11/2021 08:16:39][Message] Preparing simulation from /home/test/hack.1.out/h1.fa. Haplotype 1
[24/11/2021 08:16:39][Message] Number of available barcodes: 4792320
[24/11/2021 08:16:39][Message] Average number of paired reads per molecule: 53.333333333333336
[24/11/2021 08:16:39][Message] Number of reads required to get the expected coverage: 97455
[24/11/2021 08:16:39][Message] Expected number of molecules: 1827
[24/11/2021 08:16:39][Message] Molecules generated: 1827
[24/11/2021 08:16:39][Message] Assigned molecules to: 180 GEMs
[24/11/2021 08:16:39][Message] Assigned a unique barcode to each molecule
[24/11/2021 08:16:39][Message] Assigned a barcode to each molecule
[24/11/2021 08:16:40][Message] 4792140 barcodes left
[24/11/2021 08:16:40][Message] Simulating
[24/11/2021 08:16:40][Message] Simulating from region chr22:20000000-21000000
[24/11/2021 08:16:40][Message] Preparing simulation from /home/test/hack.1.out/h1.fa. Haplotype 1
[24/11/2021 08:16:40][Message] Number of available barcodes: 4792140
[24/11/2021 08:16:40][Message] Average number of paired reads per molecule: 53.333333333333336
[24/11/2021 08:16:40][Message] Number of reads required to get the expected coverage: 100000
[24/11/2021 08:16:40][Message] Expected number of molecules: 1875
[24/11/2021 08:16:40][Message] Molecules generated: 1875
[24/11/2021 08:16:41][Message] Assigned molecules to: 193 GEMs
[24/11/2021 08:16:41][Message] Assigned a unique barcode to each molecule
[24/11/2021 08:16:41][Message] Assigned a barcode to each molecule
[24/11/2021 08:16:41][Message] 4791947 barcodes left
[24/11/2021 08:16:41][Message] Simulating
[24/11/2021 08:16:41][Message] Simulating from region chr22:30000000-31000000
[24/11/2021 08:16:41][Message] Preparing simulation from /home/test/hack.1.out/h1.fa. Haplotype 1
[24/11/2021 08:16:41][Message] Number of available barcodes: 4791947
[24/11/2021 08:16:41][Message] Average number of paired reads per molecule: 53.333333333333336
[24/11/2021 08:16:41][Message] Number of reads required to get the expected coverage: 100000
[24/11/2021 08:16:41][Message] Expected number of molecules: 1875
[24/11/2021 08:16:41][Message] Molecules generated: 1875
[24/11/2021 08:16:42][Message] Assigned molecules to: 191 GEMs
[24/11/2021 08:16:42][Message] Assigned a unique barcode to each molecule
[24/11/2021 08:16:42][Message] Assigned a barcode to each molecule
[24/11/2021 08:16:42][Message] 4791756 barcodes left
[24/11/2021 08:16:42][Message] Simulating
[24/11/2021 08:16:43][Message] Compressing FASTQ
Traceback (most recent call last):
  File "/miniconda/envs/visorenv/bin/VISOR", line 33, in <module>
    sys.exit(load_entry_point('VISOR==1.1', 'console_scripts', 'VISOR')())
  File "/miniconda/envs/visorenv/lib/python3.8/site-packages/VISOR-1.1-py3.8.egg/VISOR/VISOR.py", line 221, in main
    args.func(parser, args)
  File "/miniconda/envs/visorenv/lib/python3.8/site-packages/VISOR-1.1-py3.8.egg/VISOR/VISOR.py", line 290, in run_subtool
    submodule.run(parser,args)
  File "/miniconda/envs/visorenv/lib/python3.8/site-packages/VISOR-1.1-py3.8.egg/VISOR/XENIA/XENIA.py", line 523, in run
    slices=Chunks(allfastq,math.ceil(chunk_size))
  File "/miniconda/envs/visorenv/lib/python3.8/site-packages/VISOR-1.1-py3.8.egg/VISOR/XENIA/XENIA.py", line 121, in Chunks
    return [l[i:i+n] for i in range(0, len(l), n)]
ValueError: range() arg 3 must not be zero
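The last two frames of the traceback show `Chunks` being called with a step of `math.ceil(chunk_size)`, and `range()` rejecting that step because it is zero. The sketch below reproduces that failure in isolation. Only the list comprehension is taken from the traceback; how XENIA actually computes `chunk_size` is not shown there, so the formula used here (number of files divided by an arbitrary 4) is an assumption for illustration, and `safe_chunks` is a hypothetical guard rather than VISOR code.

```python
import math

def chunks(l, n):
    # Same expression as the last traceback frame (XENIA.py, line 121).
    return [l[i:i + n] for i in range(0, len(l), n)]

def safe_chunks(l, n):
    # Hypothetical guard, not part of VISOR: never hand range() a zero step.
    return chunks(l, max(1, n))

# Assumption: no FASTQ files were produced, so the list is empty.
allfastq = []
# Assumption: illustrative formula only; the divisor 4 is arbitrary.
chunk_size = len(allfastq) / 4

try:
    chunks(allfastq, math.ceil(chunk_size))  # math.ceil(0.0) == 0
    message = None
except ValueError as err:
    message = str(err)

print(message)
print(safe_chunks(allfastq, math.ceil(chunk_size)))
```

With the guard in place an empty input yields an empty list of chunks instead of an exception, although the underlying problem (no FASTQ to split) would still need fixing upstream.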

data_xenia.zip

davidebolo1993 commented 2 years ago

Hi @victor0104,

I just found a bug in XENIA: wrong string indexes were being passed to the simulation function. This worked for whole-genome simulations, but for region-specific ones it left the sequences empty, so no FASTQ was simulated in the end. Thanks for reporting. This should be fixed in the latest master branch, and I'm rebuilding the Docker image now. As a case-specific suggestion, I think it's better to give XENIA a single, larger region rather than reusing the initial BED for HACk. Something like the example below should work for your case.

echo -e "chr22\t14000000\t32000000" > xenia.bed
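The single region in the command above happens to equal the span of the three regions listed in the log (chr22:15000000-16000000, 20000000-21000000, 30000000-31000000) padded by 1 Mb on each side. A minimal Python sketch of deriving such a region from an existing BED could look like the following. The function name `spanning_regions` is hypothetical, the 1 Mb padding is inferred from the example rather than stated in the thread, and the sketch assumes the first three tab-separated columns are chromosome, start, and end; it drops any further columns.

```python
def spanning_regions(bed_lines, pad=1_000_000):
    """Collapse BED records into one padded region per chromosome."""
    spans = {}
    for line in bed_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip comments, blank lines, and records without three columns.
        if line.startswith("#") or len(fields) < 3:
            continue
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        lo, hi = spans.get(chrom, (start, end))
        spans[chrom] = (min(lo, start), max(hi, end))
    # Pad both sides; clamp the start at 0 so it never goes negative.
    return [f"{c}\t{max(0, lo - pad)}\t{hi + pad}" for c, (lo, hi) in spans.items()]

# Regions taken from the log in this thread.
example = [
    "chr22\t15000000\t16000000\n",
    "chr22\t20000000\t21000000\n",
    "chr22\t30000000\t31000000\n",
]
merged = spanning_regions(example)
print("\n".join(merged))
```

The padded end is not checked against the chromosome length, so a real script would need to clamp it using the reference index.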

Let me know if I can help further.

Best,

Davide

davidebolo1993 commented 2 years ago

Hi @victor0104,

Can you confirm this is solved?

Thanks,

Davide