czbiohub-sf / MIDAS

Metagenomic Intra-Species Diversity Analysis (MIDAS)
MIT License

merge_snp issue still exist #103

Closed BaylorLyu closed 1 year ago

BaylorLyu commented 2 years ago

Hi,

From your command line history midas2_output/LRDYA/snps/145629.snps.tsv.lz4, can you make sure the sample name provided to the merge_snps command via --samples_list list_of_samples.tsv is LRDYA?

Thanks, Chunyu

Originally posted by @zhaoc1 in https://github.com/czbiohub/MIDAS2/issues/88#issuecomment-1150724995
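The sample-name check suggested in the quoted comment could be sketched as below. This is a minimal sketch, not part of MIDAS2: it assumes the samples list is a tab-separated file with a header row containing a `sample_name` column, and the file contents shown are made up for illustration.

```python
import csv
import io

def sample_names(samples_list_stream, column="sample_name"):
    """Return the values of one column from a tab-separated samples list."""
    reader = csv.DictReader(samples_list_stream, delimiter="\t")
    # The column name is an assumption; fail loudly if the header differs.
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise ValueError(f"no {column!r} column in header: {reader.fieldnames}")
    return [row[column] for row in reader]

# Hypothetical file contents, for illustration only.
example = io.StringIO("sample_name\tmidas_outdir\nLRDYA\tmidas2_output\n")
names = sample_names(example)
print("LRDYA" in names)
```

With a real file, `open("list_of_samples.tsv", newline="")` would replace the `io.StringIO` object.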

I am referencing this issue because this kind of bug still exists.

command: midas2 merge_snps --samples_list list_of_samples.tsv --midasdb_name uhgg --midasdb_dir ~/database/midas2/ midas2/ --debug

issue:

1661742862.8: Across samples population SNV calling in subcommand merge_snps with args
1661742862.8: {
1661742862.8:     "subcommand": "merge_snps",
1661742862.8:     "force": false,
1661742862.8:     "debug": true,
1661742862.8:     "zzz_worker_mode": false,
1661742862.8:     "batch_branch": "master",
1661742862.8:     "batch_memory": 378880,
1661742862.8:     "batch_vcpus": 48,
1661742862.8:     "batch_queue": "pairani",
1661742862.8:     "batch_ecr_image": "pairani:latest",
1661742862.8:     "midas_outdir": "midas2/",
1661742862.8:     "samples_list": "list_of_samples.tsv",
1661742862.8:     "midasdb_name": "uhgg",
1661742862.8:     "midasdb_dir": "/home/lbl/database/midas2/",
1661742862.8:     "species_list": null,
1661742862.8:     "genome_depth": 5.0,
1661742862.8:     "genome_coverage": 0.4,
1661742862.8:     "sample_counts": 2,
1661742862.8:     "site_depth": 5,
1661742862.8:     "site_ratio": 3.0,
1661742862.8:     "site_prev": 0.9,
1661742862.8:     "snv_type": "common",
1661742862.8:     "snp_pooled_method": "prevalence",
1661742862.8:     "snp_maf": 0.05,
1661742862.8:     "snp_type": "bi, tri, quad",
1661742862.8:     "locus_type": "any",
1661742862.8:     "num_cores": 16,
1661742862.8:     "chunk_size": 1000000,
1661742862.8:     "advanced": false,
1661742862.8:     "robust_chunk": false
1661742862.8: }
1661742863.7: 248 species pass the filter
1661742863.7: Create OUTPUT directory.
1661742863.7: 'rm -rf midas2/snps'
1661742863.7: 'mkdir -p midas2/snps'
1661742863.7: Create TEMP directory.
1661742863.7: 'rm -rf midas2/temp/snps'
1661742863.7: 'mkdir -p midas2/temp/snps'
1661742870.0: MIDAS2::write_species_summary::start
1661742870.0: MIDAS2::write_species_summary::finish
1661742870.6: MIDAS2::design_chunks::start
Traceback (most recent call last):
  File "/home/lbl/miniconda3/envs/midas2.0/bin/midas2", line 8, in <module>
    sys.exit(main())
  File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/main.py", line 24, in main
    return subcommand_main(subcommand_args)
  File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 664, in main
    merge_snps(args)
  File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 658, in merge_snps
    raise error
  File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 639, in merge_snps
    arguments_list = design_chunks(species_ids_of_interest, midas_db)
  File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 220, in design_chunks
    all_site_chunks = multithreading_map(design_chunks_per_species, [(sp, midas_db) for sp in dict_of_species.values()], num_cores) #<---
  File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/common/utils.py", line 540, in multithreading_map
    return _multi_map(func, items, num_threads, ThreadPool)
  File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/common/utils.py", line 520, in _multi_map
    return p.map(func, items, chunksize=1)
  File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/multiprocessing/pool.py", line 268, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
  File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 205, in design_chunks_per_species
    return sp.compute_snps_chunks(midas_db, chunk_size, "merge")
  File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/models/species.py", line 84, in compute_snps_chunks
    chunks_of_sites = load_chunks_cache(local_file)
  File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/site-packages/midas2/models/species.py", line 181, in load_chunks_cache
    chunks_dict = json.load(stream)
  File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/json/__init__.py", line 296, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/lbl/miniconda3/envs/midas2.0/lib/python3.7/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

How can I fix it?

zhaoc1 commented 2 years ago

Hi,

The error message suggests that the failure happened in the load_chunks_cache() step. Can you make sure the local copy of MIDASDB at ~/database/midas2/ has been downloaded properly? Please refer to https://midas2.readthedocs.io/en/latest/download_midasdb.html for instructions on downloading MIDASDB.

Thanks. Chunyu
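A `JSONDecodeError` at "line 1 column 1 (char 0)" is what `json.load` raises when the input is empty or does not start with valid JSON, so one way to look for an incomplete download is to scan the local MIDASDB directory for JSON files that fail to parse. The sketch below is only an illustration: the function name is hypothetical, and it makes no assumption about where MIDAS2 stores its chunk caches beyond their being `.json` files somewhere under the given directory.

```python
import json
from pathlib import Path

def find_unparsable_json(root):
    """Return (path, reason) pairs for .json files under root that are empty or invalid."""
    problems = []
    for path in sorted(Path(root).expanduser().rglob("*.json")):
        if path.stat().st_size == 0:
            # An empty file produces exactly the "Expecting value ... (char 0)" error.
            problems.append((path, "empty file"))
            continue
        try:
            with path.open() as stream:
                json.load(stream)
        except (json.JSONDecodeError, UnicodeDecodeError) as err:
            problems.append((path, f"invalid JSON: {err}"))
    return problems

# Example call (the path is the one used in this thread):
# for path, reason in find_unparsable_json("~/database/midas2/"):
#     print(path, reason, sep="\t")
```

Any file it reports would be a candidate for deletion and re-download.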

BaylorLyu commented 2 years ago

I am using the right chunks, but the issue still happens, and I am sure MIDASDB has been downloaded properly.

command: midas2 merge_snps --samples_list list_of_samples.tsv --midasdb_name uhgg --midasdb_dir ~/database/midas2/ midas2/ --chunk_size 500000 --sample_counts 40

issue:

1661822975.2: Across samples population SNV calling in subcommand merge_snps with args
1661822975.2: {
1661822975.2:     "subcommand": "merge_snps",
1661822975.2:     "force": false,
1661822975.2:     "debug": false,
1661822975.2:     "zzz_worker_mode": false,
1661822975.2:     "batch_branch": "master",
1661822975.2:     "batch_memory": 378880,
1661822975.2:     "batch_vcpus": 48,
1661822975.2:     "batch_queue": "pairani",
1661822975.2:     "batch_ecr_image": "pairani:latest",
1661822975.2:     "midas_outdir": "midas2/",
1661822975.2:     "samples_list": "list_of_samples.tsv",
1661822975.2:     "midasdb_name": "uhgg",
1661822975.2:     "midasdb_dir": "/home/lbl/database/midas2/",
1661822975.2:     "species_list": null,
1661822975.2:     "genome_depth": 5.0,
1661822975.2:     "genome_coverage": 0.4,
1661822975.2:     "sample_counts": 40,
1661822975.2:     "site_depth": 5,
1661822975.2:     "site_ratio": 3.0,
1661822975.2:     "site_prev": 0.9,
1661822975.2:     "snv_type": "common",
1661822975.2:     "snp_pooled_method": "prevalence",
1661822975.2:     "snp_maf": 0.05,
1661822975.2:     "snp_type": "bi, tri, quad",
1661822975.2:     "locus_type": [
1661822975.2:         "CDS"
1661822975.2:     ],
1661822975.2:     "num_cores": 16,
1661822975.2:     "chunk_size": 500000,
1661822975.2:     "advanced": false,
1661822975.2:     "robust_chunk": false
1661822975.2: }
1661822976.1: 3 species pass the filter
1661822976.1: Create OUTPUT directory.
1661822976.1: 'rm -rf midas2/snps'
1661822976.1: 'mkdir -p midas2/snps'
1661822976.1: Create TEMP directory.
1661822976.1: 'rm -rf midas2/temp/snps'
1661822976.1: 'mkdir -p midas2/temp/snps'
1661822976.2: MIDAS2::write_species_summary::start
1661822976.2: MIDAS2::write_species_summary::finish
1661822976.8: MIDAS2::design_chunks::start
1661822976.9: ================= Total number of compute chunks: 24
1661822976.9: MIDAS2::design_chunks::finish
1661822976.9: MIDAS2::multiprocessing_map::start
1661822977.0: MIDAS2::process::102478-0::start snps_worker
1661822977.0: MIDAS2::chunk_worker::102478-0::start accumulate_samples
1661822977.0: MIDAS2::process::102478-1::start snps_worker
1661822977.0: MIDAS2::chunk_worker::102478-1::start accumulate_samples
1661822977.0: MIDAS2::process::102478-2::start snps_worker
1661822977.0: MIDAS2::chunk_worker::102478-2::start accumulate_samples
1661822977.0: MIDAS2::process::102478-3::start snps_worker
1661822977.0: MIDAS2::chunk_worker::102478-3::start accumulate_samples
1661822977.0: MIDAS2::process::102478-4::start snps_worker
1661822977.0: MIDAS2::chunk_worker::102478-4::start accumulate_samples
1661822977.0: MIDAS2::process::102478-5::start snps_worker
1661822977.0: MIDAS2::chunk_worker::102478-5::start accumulate_samples
1661822977.0: MIDAS2::process::102478-6::start snps_worker
1661822977.0: MIDAS2::chunk_worker::102478-6::start accumulate_samples
1661822977.0: MIDAS2::process::102478-7::start snps_worker
1661822977.0: MIDAS2::chunk_worker::102478-7::start accumulate_samples
1661822977.0: MIDAS2::process::102478-8::start snps_worker
1661822977.0: MIDAS2::chunk_worker::102478-8::start accumulate_samples
1661822977.0: MIDAS2::process::102478-9::start snps_worker
1661822977.0: MIDAS2::chunk_worker::102478-9::start accumulate_samples
1661822977.0: MIDAS2::process::101346-0::start snps_worker
1661822977.0: MIDAS2::chunk_worker::101346-0::start accumulate_samples
1661822977.0: MIDAS2::process::101346-1::start snps_worker
1661822977.0: MIDAS2::chunk_worker::101346-1::start accumulate_samples
1661822977.0: MIDAS2::process::101346-2::start snps_worker
1661822977.0: MIDAS2::chunk_worker::101346-2::start accumulate_samples
1661822977.0: MIDAS2::process::101346-3::start snps_worker
1661822977.0: MIDAS2::chunk_worker::101346-3::start accumulate_samples
1661822977.0: MIDAS2::process::101346-4::start snps_worker
1661822977.0: MIDAS2::chunk_worker::101346-4::start accumulate_samples
1661822977.0: MIDAS2::process::101346-5::start snps_worker
1661822977.0: MIDAS2::chunk_worker::101346-5::start accumulate_samples
1661822977.0: WARNING: Non-zero exit code 141 from reader of midas2/HC1/snps/102478.snps.tsv.lz4.
1661822977.0: MIDAS2::process::101346-6::start snps_worker
1661822977.0: MIDAS2::chunk_worker::101346-6::start accumulate_samples
1661822977.0: WARNING: Non-zero exit code 141 from reader of midas2/HC1/snps/101346.snps.tsv.lz4.
1661822977.0: MIDAS2::process::101346-7::start snps_worker
1661822977.0: MIDAS2::chunk_worker::101346-7::start accumulate_samples
1661822977.1: WARNING: Non-zero exit code 141 from reader of midas2/HC1/snps/101346.snps.tsv.lz4.
1661822977.1: MIDAS2::process::102492-0::start snps_worker
1661822977.1: MIDAS2::chunk_worker::102492-0::start accumulate_samples
1661822977.1: WARNING: Non-zero exit code 141 from reader of midas2/HC1/snps/102478.snps.tsv.lz4.
1661822977.1: MIDAS2::process::102492-1::start snps_worker
1661822977.1: MIDAS2::chunk_worker::102492-1::start accumulate_samples
1661822977.1: WARNING: Non-zero exit code 141 from reader of midas2/HC1/snps/102492.snps.tsv.lz4.
1661822977.1: MIDAS2::process::102492-2::start snps_worker
1661822977.1: MIDAS2::chunk_worker::102492-2::start accumulate_samples
1661822977.2: WARNING: Non-zero exit code 141 from reader of midas2/HC1/snps/101346.snps.tsv.lz4.
1661822977.2: MIDAS2::process::102492-3::start snps_worker
1661822977.2: MIDAS2::chunk_worker::102492-3::start accumulate_samples
1661822977.2: WARNING: Non-zero exit code 141 from reader of midas2/HC1/snps/102478.snps.tsv.lz4.
1661822977.2: MIDAS2::process::102492-4::start snps_worker
1661822977.2: MIDAS2::chunk_worker::102492-4::start accumulate_samples
1661822977.3: WARNING: Non-zero exit code 141 from reader of midas2/HC1/snps/101346.snps.tsv.lz4.
1661822977.3: MIDAS2::process::102492-5::start snps_worker
1661822977.3: MIDAS2::chunk_worker::102492-5::start accumulate_samples
1661822977.3: WARNING: Non-zero exit code 141 from reader of midas2/HC1/snps/102492.snps.tsv.lz4.
1661822977.3: MIDAS2::process::102478--1::wait collect_chunks
1661822977.3: WARNING: Non-zero exit code 141 from reader of midas2/HC1/snps/102478.snps.tsv.lz4.
1661822977.3: MIDAS2::process::101346--1::wait collect_chunks
1661822977.4: WARNING: Non-zero exit code 141 from reader of midas2/HC1/snps/101346.snps.tsv.lz4.
1661822977.4: MIDAS2::process::102492--1::wait collect_chunks
1661822977.4: WARNING: Non-zero exit code 141 from reader of midas2/HC1/snps/101346.snps.tsv.lz4.
1661822977.4: WARNING: Non-zero exit code 141 from reader of midas2/HC1/snps/102492.snps.tsv.lz4.
1661822977.4: WARNING: Non-zero exit code 141 from reader of midas2/HC1/snps/102478.snps.tsv.lz4.
1661822977.5: WARNING: Non-zero exit code 141 from reader of midas2/HC1/snps/101346.snps.tsv.lz4.
1661822977.5: WARNING: Non-zero exit code 141 from reader of midas2/HC1/snps/101346.snps.tsv.lz4.
1661822977.5: MIDAS2::process::101346--1::start collect_chunks
cat: midas2/temp/snps/101346/cid.0_snps_info.tsv.lz4: No such file or directory
cat: midas2/temp/snps/101346/cid.1_snps_info.tsv.lz4: No such file or directory
cat: midas2/temp/snps/101346/cid.2_snps_info.tsv.lz4: No such file or directory
cat: midas2/temp/snps/101346/cid.3_snps_info.tsv.lz4: No such file or directory
cat: midas2/temp/snps/101346/cid.4_snps_info.tsv.lz4: No such file or directory
cat: midas2/temp/snps/101346/cid.5_snps_info.tsv.lz4: No such file or directory
cat: midas2/temp/snps/101346/cid.6_snps_info.tsv.lz4: No such file or directory
cat: midas2/temp/snps/101346/cid.7_snps_info.tsv.lz4: No such file or directory
1661822977.6: WARNING: Non-zero exit code 141 from reader of midas2/HC1/snps/102478.snps.tsv.lz4.
1661822977.6: WARNING: Non-zero exit code 141 from reader of midas2/HC1/snps/102492.snps.tsv.lz4.
1661822977.7: WARNING: Non-zero exit code 141 from reader of midas2/HC1/snps/102478.snps.tsv.lz4.
1661822977.8: WARNING: Non-zero exit code 141 from reader of midas2/HC1/snps/102492.snps.tsv.lz4.
1661822977.8: WARNING: Non-zero exit code 141 from reader of midas2/HC1/snps/102478.snps.tsv.lz4.
1661822977.9: WARNING: Non-zero exit code 141 from reader of midas2/HC1/snps/102478.snps.tsv.lz4.
1661822977.9: WARNING: Non-zero exit code 141 from reader of midas2/HC1/snps/102492.snps.tsv.lz4.
1661822977.9: MIDAS2::process::102492--1::start collect_chunks
cat: midas2/temp/snps/102492/cid.0_snps_info.tsv.lz4: No such file or directory
cat: midas2/temp/snps/102492/cid.1_snps_info.tsv.lz4: No such file or directory
cat: midas2/temp/snps/102492/cid.2_snps_info.tsv.lz4: No such file or directory
cat: midas2/temp/snps/102492/cid.3_snps_info.tsv.lz4: No such file or directory
cat: midas2/temp/snps/102492/cid.4_snps_info.tsv.lz4: No such file or directory
cat: midas2/temp/snps/102492/cid.5_snps_info.tsv.lz4: No such file or directory
1661822978.0: WARNING: Non-zero exit code 141 from reader of midas2/HC1/snps/102478.snps.tsv.lz4.
1661822978.0: MIDAS2::process::102478--1::start collect_chunks
cat: midas2/temp/snps/102478/cid.0_snps_info.tsv.lz4: No such file or directory
cat: midas2/temp/snps/102478/cid.1_snps_info.tsv.lz4: No such file or directory
cat: midas2/temp/snps/102478/cid.2_snps_info.tsv.lz4: No such file or directory
cat: midas2/temp/snps/102478/cid.3_snps_info.tsv.lz4: No such file or directory
cat: midas2/temp/snps/102478/cid.4_snps_info.tsv.lz4: No such file or directory
cat: midas2/temp/snps/102478/cid.5_snps_info.tsv.lz4: No such file or directory
cat: midas2/temp/snps/102478/cid.6_snps_info.tsv.lz4: No such file or directory
cat: midas2/temp/snps/102478/cid.7_snps_info.tsv.lz4: No such file or directory
cat: midas2/temp/snps/102478/cid.8_snps_info.tsv.lz4: No such file or directory
cat: midas2/temp/snps/102478/cid.9_snps_info.tsv.lz4: No such file or directory
1661822978.0: Bugs in the codes, keep the outputs for debugging purpose.
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 270, in process
    snps_worker(species_id, chunk_id)
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 293, in snps_worker
    chunk_worker(chunks_of_sites[chunk_id][0])
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 345, in chunk_worker
    accumulate(accumulator, proc_args)
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 378, in accumulate
    for row in select_from_tsv(stream, schema=curr_schema, selected_columns=snps_pileup_basic_schema, result_structure=dict):
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/site-packages/midas2/common/utils.py", line 392, in select_from_tsv
    assert False, f"Line {i + j} has {len(values)} columns; was expecting {len(headers)}."
AssertionError: Line 0 has 13 columns; was expecting 8.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/lbl/miniconda3/envs/midas2/bin/midas2", line 8, in <module>
    sys.exit(main())
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/site-packages/midas2/main.py", line 24, in main
    return subcommand_main(subcommand_args)
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 664, in main
    merge_snps(args)
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 653, in merge_snps
    raise error
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/site-packages/midas2/subcommands/merge_snps.py", line 643, in merge_snps
    proc_flags = multiprocessing_map(process, arguments_list, args.num_cores)
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/site-packages/midas2/common/utils.py", line 532, in multiprocessing_map
    return _multi_map(func, items, num_procs, multiprocessing.Pool)
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/site-packages/midas2/common/utils.py", line 520, in _multi_map
    return p.map(func, items, chunksize=1)
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/multiprocessing/pool.py", line 268, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
AssertionError: Line 0 has 13 columns; was expecting 8.

zhaoc1 commented 2 years ago

Hi,

I can see a different error message was reported here.

WARNING: Non-zero exit code 141 from reader of midas2/HC1/snps/102478.snps.tsv.lz4. AND AssertionError: Line 0 has 13 columns; was expecting 8.

It looks like you passed --advanced to the single-sample SNP step (midas2 run_snps). Can you try passing --advanced to your merge_snps command as well? The basic SNP output files have 8 columns, while the advanced SNP output files have 13 columns.

Chunyu
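The 8-versus-13 column distinction described above can be checked from the first line of a per-sample SNP file before running merge_snps. The following is a rough sketch, not MIDAS2 code: the function name is hypothetical, the only facts it relies on are the two column counts given in this thread, and the example header uses placeholder column names rather than the real ones.

```python
import io

# Column counts as stated in the thread: basic output has 8, --advanced has 13.
BASIC_COLUMNS = 8
ADVANCED_COLUMNS = 13

def classify_snps_header(stream):
    """Count tab-separated fields on the first line and label the output mode."""
    first_line = stream.readline().rstrip("\r\n")
    n_columns = len(first_line.split("\t"))
    if n_columns == BASIC_COLUMNS:
        mode = "basic"
    elif n_columns == ADVANCED_COLUMNS:
        mode = "advanced"
    else:
        mode = "unknown"
    return n_columns, mode

# Placeholder header with 13 made-up column names, for illustration only.
example = io.StringIO("\t".join(f"col{i}" for i in range(13)) + "\n")
print(classify_snps_header(example))
```

For a real .lz4 file, the decompressed first line could be piped to a script that calls this function on `sys.stdin`, assuming the lz4 command-line tool is installed (for example `lz4 -dc midas2/HC1/snps/102478.snps.tsv.lz4 | head -n 1`).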

BaylorLyu commented 2 years ago

Thanks, that issue is fixed. Now I have another one. I am using a prebuilt pangenome index to call SNPs.

Here is my command:

midas2 run_snps --sample_name HC1 \
  -1 ~/HCCMicrobiome/HCC/HC1_R1.fastq.gz \
  -2 ~/HCCMicrobiome/HCC/HC1_R2.fastq.gz --num_cores 60 \
  --chunk_size 500000 \
  --prebuilt_bowtie2_indexes midas2/bt2_indexes/pangenomes \
  --prebuilt_bowtie2_species midas2/bt2_indexes/pangenomes.species \
  --midasdb_name uhgg --midasdb_dir ~/database/midas2/ ~/HCCMicrobiome/midas2/ --debug

And the error:

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/site-packages/midas2/subcommands/run_snps.py", line 469, in filter_bam
    reads_stats = filter_bam_by_single_read(species_id, repgenome_bamfile, filtered_bamfile)
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/site-packages/midas2/subcommands/run_snps.py", line 309, in filter_bam_by_single_read
    for aln in infile.fetch(contig_id):
  File "pysam/libcalignmentfile.pyx", line 1091, in pysam.libcalignmentfile.AlignmentFile.fetch
  File "pysam/libchtslib.pyx", line 685, in pysam.libchtslib.HTSFile.parse_region
ValueError: invalid contig gnl|Prokka|UHGG239728_1
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/lbl/miniconda3/envs/midas2/bin/midas2", line 8, in <module>
    sys.exit(main())
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/site-packages/midas2/main.py", line 24, in main
    return subcommand_main(subcommand_args)
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/site-packages/midas2/subcommands/run_snps.py", line 861, in main
    run_snps(args)
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/site-packages/midas2/subcommands/run_snps.py", line 855, in run_snps
    raise error
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/site-packages/midas2/subcommands/run_snps.py", line 827, in run_snps
    list_of_contig_aln_stats = multiprocessing_map(filter_bam, args_list, args.num_cores)
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/site-packages/midas2/common/utils.py", line 532, in multiprocessing_map
    return _multi_map(func, items, num_procs, multiprocessing.Pool)
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/site-packages/midas2/common/utils.py", line 520, in _multi_map
    return p.map(func, items, chunksize=1)
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/multiprocessing/pool.py", line 268, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/lbl/miniconda3/envs/midas2/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
ValueError: invalid contig gnl|Prokka|UHGG239728_1

zhaoc1 commented 2 years ago

Hmm, may I ask whether you built your pangenome database with the midas2 build_bowtie2db command?

Also, can you check whether the contig ID gnl|Prokka|UHGG239728_1 is present in midas2/bt2_indexes/pangenomes.fa (or whichever FASTA file you used to build the pangenomes)?
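Checking whether a contig ID is present in a FASTA file can be done by scanning its header lines. The sketch below is an illustration with a hypothetical function name. It assumes the contig ID is the first whitespace-delimited token after `>` on each header line, and the two-record FASTA used here is made up.

```python
import io

def fasta_has_contig(fasta_lines, contig_id):
    """Return True if any FASTA header's first token equals contig_id."""
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            # Compare only the ID, ignoring any description after whitespace.
            if fields and fields[0] == contig_id:
                return True
    return False

# Made-up FASTA content, for illustration only.
example = io.StringIO(">contig_A some description\nACGT\n>contig_B\nTTGA\n")
print(fasta_has_contig(example, "contig_B"))
```

For the file in this thread, the call would look like `fasta_has_contig(open("midas2/bt2_indexes/pangenomes.fa"), "gnl|Prokka|UHGG239728_1")`.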

BaylorLyu commented 2 years ago

Here is my command for building the pangenome index:

midas2 build_bowtie2db \
  --midasdb_name uhgg \
  --midasdb_dir ~/database/midas2/ \
  --species_profile midas2/species/species_prevalence.tsv \
  --select_by mean_coverage \
  --select_threshold 5 --num_cores 60 \
  --bt2_indexes_name pangenomes \
  --bt2_indexes_dir midas2/bt2_indexes

And I used this pangenome index successfully for CNV calling:

midas2 run_genes --sample_name $a \
  -1 ~/HCCMicrobiome/HCC/$a.R1.fastq.gz \
  -2 ~/HCCMicrobiome/HCC/$a.R2.fastq.gz \
  --midasdb_name uhgg --midasdb_dir ~/database/midas2/ \
  --select_threshold=-1 --num_cores 60 \
  --prebuilt_bowtie2_indexes midas2/bt2_indexes/pangenomes \
  --prebuilt_bowtie2_species midas2/bt2_indexes/pangenomes.species \
  ~/HCCMicrobiome/midas2/

Should I rebuild this index for SNP calling?

zhaoc1 commented 2 years ago

--bt2_indexes_name pangenomes is for CNV calling and --bt2_indexes_name repgenomes is for SNV calling. Does this answer your question?
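Following that answer, the index for SNV calling would be rebuilt with the same build_bowtie2db flags used earlier in this thread, changing only the index name. The sketch below just assembles that argument list in Python. The helper name is hypothetical, every flag and path is copied from the earlier comment, and nothing is executed unless the final line is uncommented.

```python
import os

def build_repgenomes_index_command(midasdb_dir, species_profile, index_dir, num_cores=60):
    """Assemble the build_bowtie2db arguments with the index name set to repgenomes."""
    return [
        "midas2", "build_bowtie2db",
        "--midasdb_name", "uhgg",
        # Expand "~" here because no shell is involved when passing a list.
        "--midasdb_dir", os.path.expanduser(midasdb_dir),
        "--species_profile", species_profile,
        "--select_by", "mean_coverage",
        "--select_threshold", "5",
        "--num_cores", str(num_cores),
        "--bt2_indexes_name", "repgenomes",  # the only change from the pangenomes build
        "--bt2_indexes_dir", index_dir,
    ]

cmd = build_repgenomes_index_command(
    "~/database/midas2/",
    "midas2/species/species_prevalence.tsv",
    "midas2/bt2_indexes",
)
print(" ".join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

By analogy with the pangenomes paths above, run_snps would then presumably be pointed at midas2/bt2_indexes/repgenomes and midas2/bt2_indexes/repgenomes.species; those two paths are an assumption, not something stated in this thread.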

BaylorLyu commented 2 years ago

Thanks, this helps me a lot.

zhaoc1 commented 1 year ago

No problem. I am gonna close this issue. Please let me know if you have more questions.