HadrienG / InSilicoSeq

:rocket: A sequencing simulator
https://insilicoseq.readthedocs.io
MIT License

FileNotFoundError in concatenate #269

Open jjkoehorst opened 1 month ago

jjkoehorst commented 1 month ago
iss generate --draft GCF_000196335.1_ASM19633v1_genomic.fna -o GCF_000196335.1 --cpus 6 --model novaseq --coverage_file coverage.txt 

INFO:iss.app:Starting iss generate
INFO:iss.generator:Using kde ErrorModel
INFO:iss.util:Stitching input files together
WARNING:iss.generator:--coverage_file is an experimental feature
WARNING:iss.generator:--coverage_file disables --n_reads
INFO:iss.generator:Using coverage file:coverage.txt
INFO:iss.app:Using 6 cpus for read generation
INFO:iss.app:Generating 1000000 reads
INFO:iss.util:Stitching input files together
Traceback (most recent call last):
  File "/Users/koeho006/git/ssb/iBioSystems_Reader/data/venv/bin/iss", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/koeho006/git/ssb/iBioSystems_Reader/data/venv/lib/python3.12/site-packages/iss/app.py", line 454, in main
    args.func(args)
  File "/Users/koeho006/git/ssb/iBioSystems_Reader/data/venv/lib/python3.12/site-packages/iss/app.py", line 126, in generate_reads
    util.concatenate(temp_R1, args.output + "_R1.fastq")
  File "/Users/koeho006/git/ssb/iBioSystems_Reader/data/venv/lib/python3.12/site-packages/iss/util.py", line 233, in concatenate
    with open(file_name, "rb") as f:
         ^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'GCF_000196335.1.iss.tmp.4_R1.fastq'

I have no idea why it tries to merge that file, since it indeed does not exist.

jjkoehorst commented 1 month ago

These are the files that have been generated:

GCF_000196335.1.iss.tmp.0.vcf         GCF_000196335.1.iss.tmp.1.vcf         GCF_000196335.1.iss.tmp.2.vcf         GCF_000196335.1.iss.tmp.3.vcf         GCF_000196335.1.iss.tmp.genomes.fasta
GCF_000196335.1.iss.tmp.0_R1.fastq    GCF_000196335.1.iss.tmp.1_R1.fastq    GCF_000196335.1.iss.tmp.2_R1.fastq    GCF_000196335.1.iss.tmp.3_R1.fastq
GCF_000196335.1.iss.tmp.0_R2.fastq    GCF_000196335.1.iss.tmp.1_R2.fastq    GCF_000196335.1.iss.tmp.2_R2.fastq    GCF_000196335.1.iss.tmp.3_R2.fastq

Maybe the count is one off?
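
One way to check the count is to compare the chunk files on disk with the ones a given `--cpus` value would imply. The sketch below is not part of InSilicoSeq: `missing_chunks` is a hypothetical helper, and it assumes one temporary file per CPU, named `<prefix>.iss.tmp.<i>_R1.fastq` as in the traceback above.

```python
from pathlib import Path


def missing_chunks(prefix, n_cpus, directory=".", mate="R1"):
    """Return the chunk indices whose temporary FASTQ file is absent.

    Assumption: one chunk file per CPU, indexed 0..n_cpus-1, following the
    naming pattern seen in the traceback. This is a diagnostic sketch,
    not InSilicoSeq code.
    """
    directory = Path(directory)
    return [
        i
        for i in range(n_cpus)
        if not (directory / f"{prefix}.iss.tmp.{i}_{mate}.fastq").exists()
    ]


if __name__ == "__main__":
    # Prefix and CPU count taken from the command in this issue.
    print(missing_chunks("GCF_000196335.1", 6))
```

Run in the output directory, this prints the indices that `concatenate` would fail on under the stated assumption.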

jjkoehorst commented 1 month ago

It seems to be related to the number of cores: when running it on 1 thread, it works :)

ThijsMaas commented 1 month ago

Hi, thanks for sending in this issue. The experimental coverage mode did not correctly calculate the total number of reads to be generated, which resulted in an incorrect number and size of multiprocessing work chunks.

I'm working on a fix for this issue right away.
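
To illustrate the kind of mismatch described above, here is a minimal sketch of splitting a read total into work chunks. It is not the InSilicoSeq implementation or the actual fix: `split_reads` and `temp_file_names` are hypothetical names, and only the file naming pattern is taken from the traceback. The point is that the list of files to merge should be derived from the same chunk list the workers used, so the two cannot disagree.

```python
def split_reads(total_reads, n_chunks):
    """Split total_reads into n_chunks sizes that sum exactly to total_reads."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    base, remainder = divmod(total_reads, n_chunks)
    # The first `remainder` chunks take one extra read so none are lost.
    return [base + 1 if i < remainder else base for i in range(n_chunks)]


def temp_file_names(prefix, n_chunks, mate="R1"):
    """Build one temporary file name per chunk (pattern assumed from the traceback)."""
    return [f"{prefix}.iss.tmp.{i}_{mate}.fastq" for i in range(n_chunks)]


if __name__ == "__main__":
    sizes = split_reads(1_000_000, 6)
    print(sizes)  # [166667, 166667, 166667, 166667, 166666, 166666]
    # Derive the merge list from len(sizes), not from a separate CPU count.
    print(temp_file_names("GCF_000196335.1", len(sizes)))
```

If the total is computed incorrectly, or the merge step counts chunks differently from the generation step, the merge will look for files that were never written, which matches the `FileNotFoundError` reported here.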