bacpop / PopPUNK

PopPUNK 👨‍🎤 (POPulation Partitioning Using Nucleotide Kmers)
https://www.bacpop.org/poppunk
Apache License 2.0

Multiple genome files reported to contain no sequence #199

Closed drhoads closed 2 years ago

drhoads commented 2 years ago

Versions

PopPUNK v2.4.0, sketchlib v1.7.3, running in a WSL2 Ubuntu VM on Windows 10.

Command used and output returned

The same installation and bash file are used each time. Bash file:

poppunk --create-db --output PoppunkDB --r-files r-list.txt --external-clustering external_clusters.csv --qc-filter continue --threads 8
poppunk --fit-model lineage --ref-db PoppunkDB --output PopPunkLineageModel --graph-weights --qc-filter continue --threads 8
poppunk_visualise --ref-db PoppunkDB --model-dir PopPunkLineageModel --output Poppunk_viz --overwrite --microreact --threads 8

Describe the bug

I have seen this error before, and it was often caused by malformed strain names containing a space or "(", or by forgetting to convert my r-list.txt with dos2unix.


All the requisite genomes are first copied into a subfolder 'genomes', and then the r-list.txt is built. The bash file finishes just fine on one set of 44 genomes, but for a set of 37 (many of them the same) I get errors that multiple files contain no sequence, followed by a series of traceback calls:

Graph-tools OpenMP parallelisation enabled: with 8 threads
Mode: Building new database from input sequences
Sketching 37 genomes using 8 thread(s)
Progress (CPU): 37 / 37
SP88d contains no sequence
B3-14B contains no sequence
SP82a contains no sequence
1510 contains no sequence
ch5 contains no sequence
ED98 contains no sequence
1521 contains no sequence
1516 contains no sequence

Traceback (most recent call last):
  File "/home/dougrhoads/miniconda3/envs/Poppunk/bin/poppunk", line 11, in <module>
    sys.exit(main())
  File "/home/dougrhoads/miniconda3/envs/Poppunk/lib/python3.9/site-packages/PopPUNK/main.py", line 307, in main
    constructDatabase(
  File "/home/dougrhoads/miniconda3/envs/Poppunk/lib/python3.9/site-packages/PopPUNK/sketchlib.py", line 407, in constructDatabase
    pp_sketchlib.constructDatabase(dbname,
RuntimeError: Errors during sketching

PopPUNK (POPulation Partitioning Using Nucleotide Kmers)
(with backend: sketchlib v1.7.3
sketchlib: /home/dougrhoads/miniconda3/envs/Poppunk/lib/python3.9/site-packages/pp_sketchlib.cpython-39-x86_64-linux-gnu.so)


Note that I have used these same genomes before with success. I even tried running dos2unix on all the .fna files and on r-list.txt.
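For readers who build r-list.txt by hand from a 'genomes' subfolder as described above, here is a minimal sketch of generating it with a script instead. It is not part of PopPUNK. It assumes the r-file is a two-column, tab-separated list of sample name and file path, and that the assemblies end in .fna. The function name `build_rlist` and the name-cleaning rule are hypothetical choices for illustration.

```python
import re
from pathlib import Path


def build_rlist(genome_dir, out_path, suffix=".fna"):
    """Write one 'name<TAB>absolute/path' line per genome file and return the rows.

    Assumption: the r-file format is 'sample name, tab, path to assembly'.
    """
    rows = []
    for fasta in sorted(Path(genome_dir).glob(f"*{suffix}")):
        # Replace spaces, brackets and other unusual characters in the name
        name = re.sub(r"[^A-Za-z0-9._-]", "_", fasta.stem)
        # resolve() gives an absolute path; as_posix() always uses "/"
        rows.append((name, fasta.resolve().as_posix()))
    # newline="\n" forces Unix line endings even when run under Windows
    with open(out_path, "w", newline="\n") as handle:
        for name, path in rows:
            handle.write(f"{name}\t{path}\n")
    return rows
```

Calling `build_rlist("genomes", "r-list.txt")` from the working directory would then produce a list with absolute, forward-slash paths and Unix line endings, which may avoid the need for a separate dos2unix step.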

drhoads commented 2 years ago

Please disregard this issue posting and close this. I missed a couple details. Finally figured it out. I had lost my old notes. I will try to not make that mistake again.

nickjcroucher commented 2 years ago

Thanks for letting us know @drhoads - happens to all of us - please reopen if you want to let us know what the cause was, for the reference of anyone else encountering this error in the future

drhoads commented 2 years ago

Well, there were several issues, and most of them were bone-headed moves on my part. The files were properly transferred and the r-list.txt was prepared. But I forgot that I needed to include the actual path to the copied files, and, switching back and forth between DOS and Linux, I used the wrong folder separator ("\" instead of "/"). So I was chasing the wrong problem (what is wrong with the sequence files?) because the program said there was no sequence data in the files, when in reality the program could not FIND the files because I had given an incorrect path.
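Since an unreadable path surfaced here as "contains no sequence", it may help to check the r-list before running poppunk --create-db. Below is a minimal sketch of such a check, not something PopPUNK provides. It assumes the r-file has a sample name and a path separated by a tab on each line. The function name `check_rlist` and the messages are hypothetical.

```python
from pathlib import Path


def check_rlist(rlist_path):
    """Return a list of (line_number, problem) tuples for an r-file.

    Assumption: each non-blank line is 'sample name<TAB>path to assembly'.
    """
    problems = []
    # Read bytes so that Windows line endings are not silently normalised
    raw = Path(rlist_path).read_bytes()
    for number, raw_line in enumerate(raw.split(b"\n"), start=1):
        if not raw_line.strip():
            continue  # skip blank lines, including the one after the final newline
        if raw_line.endswith(b"\r"):
            problems.append((number, "Windows (CRLF) line ending"))
        text = raw_line.decode("utf-8", errors="replace").rstrip("\r")
        fields = text.split("\t")
        if len(fields) < 2:
            problems.append((number, "expected name<TAB>path"))
            continue
        path = fields[1]
        if "\\" in path:
            problems.append((number, f"backslash in path: {path}"))
        target = Path(path)
        if not target.is_file():
            problems.append((number, f"file not found: {path}"))
        elif target.stat().st_size == 0:
            problems.append((number, f"file is empty: {path}"))
    return problems
```

Printing the result of `check_rlist("r-list.txt")` before sketching would list each suspicious line, so a wrong folder or a "\" separator shows up as "file not found" rather than as a sequence error later on.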

drhoads commented 2 years ago

I posted a mea culpa reply on GitHub.
