bcgsc / NanoSim

Nanopore sequence read simulator

`fit_lognorm` expects `_base_qualities_model_parameters.tsv`, however the characterization stage does not return this file #227

Closed: ezherman closed this issue 1 month ago

ezherman commented 1 month ago

I am running the following command:

simulator.py metagenome -gl results/config/test/metagenome_list_for_simulation.tsv -a results/config/test/abundance_for_simulation.tsv -dl results/config/test/dna_type_list.tsv -c results/models/sample/training -o results/simulation/test/simulated --fastq -t 8

Which returns the following error:

Traceback (most recent call last):
  File "/datadrive/metagenomic-simulation/.snakemake/conda/1ee5636240868e58cb37ca975d706f10_/bin/simulator.py", line 2431, in <module>
    main()
  File "/datadrive/metagenomic-simulation/.snakemake/conda/1ee5636240868e58cb37ca975d706f10_/bin/simulator.py", line 2390, in main
    read_profile(genome_list, [], model_prefix, perfect, args.mode, strandness, dna_type=dna_type_list, abun=abun,
  File "/datadrive/metagenomic-simulation/.snakemake/conda/1ee5636240868e58cb37ca975d706f10_/bin/simulator.py", line 582, in read_profile
    with open(model_prefix + "_base_qualities_model_parameters.tsv") as base_quality_params:
FileNotFoundError: [Errno 2] No such file or directory: 'results/models/sample/training_base_qualities_model_parameters.tsv'

The output folder of the characterization stage does not contain the expected file with the suffix _base_qualities_model_parameters.tsv:

ls -1 results/models/sample/
training_aligned_reads.pkl
training_aligned_region.pkl
training_chimeric_info
training_del.hist
training_error_markov_model
training_error_rate.tsv
training_first_match.hist
training_gap_length.pkl
training_genome_alnm.bam
training_head.txt
training_ht.txt
training_ht_length.pkl
training_ht_ratio.pkl
training_ins.hist
training_match.hist
training_match_markov_model
training_middle.txt
training_middle_ref.txt
training_mis.hist
training_model_profile
training_primary.bam
training_processed.fasta.gz
training_quantification.tsv
training_ratio.txt
training_reads_alignment_rate
training_strandness_rate
training_tail.txt
training_total.txt
training_unaligned_length.pkl

ezherman commented 1 month ago

I had forgotten to include --fastq in the characterisation stage. Adding that option there resolved the error above.
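
For anyone driving this from a pipeline, a small pre-flight check can surface the mismatch before the simulator fails. The sketch below is not part of NanoSim: the helper name `missing_model_files` and its arguments are hypothetical, and it assumes the base-quality file is only needed when simulating with --fastq. The file suffix is the one from the traceback above.

```python
from pathlib import Path

# Suffix copied from the FileNotFoundError in the traceback.
BASE_QUALITY_SUFFIX = "_base_qualities_model_parameters.tsv"


def missing_model_files(model_prefix, want_fastq):
    """Return model files a --fastq simulation would open but that do not exist.

    Hypothetical helper: assumes the base-quality parameters file is only
    required when FASTQ output is requested.
    """
    if not want_fastq:
        return []
    expected = Path(model_prefix + BASE_QUALITY_SUFFIX)
    return [] if expected.is_file() else [str(expected)]


if __name__ == "__main__":
    # Model prefix as passed to -c in the command from this issue.
    for path in missing_model_files("results/models/sample/training", want_fastq=True):
        print(f"missing: {path} (was the characterization stage run with --fastq?)")
```

Running the check as a step before the simulation rule would turn the traceback into a single message pointing at the characterization stage.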