anuradhawick / LRBinner

LRBinner is a long-read binning tool published in WABI 2021 proceedings and AMB.
https://doi.org/10.4230/LIPIcs.WABI.2021.11
GNU General Public License v2.0

Computing Contig Length Error #4

Open cazzlewazzle89 opened 2 years ago

cazzlewazzle89 commented 2 years ago

Hi @anuradhawick

Sorry to keep inundating you with issues. I'm trying to bin contigs that I have assembled from ONT data using metaFlye, but I am running into the following error (command used also provided).

Any advice would be greatly appreciated, Calum

LRBinner contigs --reads-path FastQ_Raw/WW_combined_porechopped.fastq --contigs Assemblies/WW_flye_auto/Medaka/consensus.fasta --output LRBinner_Output/WW/

2021-11-19 13:01:31,696 - INFO - Command /home/cwwalsh/Software/LRBinner/LRBinner contigs --reads-path FastQ_Raw/WW_combined_porechopped.fastq --contigs Assemblies/WW_flye_auto/Medaka/consensus.fasta --output LRBinner_Output/WW/
2021-11-19 13:01:31,697 - INFO - Computing contig lengths
Traceback (most recent call last):
  File "/home/cwwalsh/Software/LRBinner/LRBinner", line 197, in <module>
    main()
  File "/home/cwwalsh/Software/LRBinner/LRBinner", line 179, in main
    pipelines.run_contig_binning(args)
  File "/home/cwwalsh/Software/LRBinner/mbcclr_utils/pipelines.py", line 63, in run_contig_binning
    pickle.dump(contig_length, open(f"{output}/profiles/contig_lengths.pkl", "wb+"))
FileNotFoundError: [Errno 2] No such file or directory: 'LRBinner_Output/WW//profiles/contig_lengths.pkl'

anuradhawick commented 2 years ago

Hi @cazzlewazzle89,

Thanks for the issue, much appreciated. I am currently building this part of LRBinner so a straight fix might take some time to come.

I suspect this may have something to do with the output path --output LRBinner_Output/WW/. Could you try giving a directory that is not nested, such as --output LRBinner_Output_WW/? I do not recall LRBinner supporting nested paths.
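The traceback is consistent with Python's `open()` being asked to create a file inside a directory that does not exist yet: in write mode, `open()` creates the file but not any missing parent directories. Below is a minimal sketch of that failure mode and one possible guard. It assumes the `profiles` subdirectory is missing when the pickle is written; the helper names `dump_contig_lengths` and `naive_dump_fails` are hypothetical and are not part of LRBinner.

```python
import os
import pickle
import tempfile


def dump_contig_lengths(contig_length, output):
    """Hypothetical helper: create <output>/profiles before pickling."""
    profiles_dir = os.path.join(output, "profiles")
    # makedirs creates every missing parent; exist_ok avoids an error on reruns.
    os.makedirs(profiles_dir, exist_ok=True)
    path = os.path.join(profiles_dir, "contig_lengths.pkl")
    with open(path, "wb") as handle:
        pickle.dump(contig_length, handle)
    return path


def naive_dump_fails(contig_length, output):
    """Return True if writing without creating directories raises FileNotFoundError."""
    try:
        with open(f"{output}/profiles/contig_lengths.pkl", "wb+") as handle:
            pickle.dump(contig_length, handle)
    except FileNotFoundError:
        return True
    return False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        nested = os.path.join(tmp, "LRBinner_Output", "WW")
        lengths = {"contig_1": 1500}  # made-up example data
        print(naive_dump_fails(lengths, nested))
        print(dump_contig_lengths(lengths, nested))
```

If this assumption holds, creating the output directory and its `profiles` subdirectory by hand before running the command may also work around the error.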

However, I will keep this in mind and make fixes. I have not rigorously tested the contigs option of LRBinner, as it goes beyond what I presented in the paper. Let me know if this fixes the error; if not, I will look into it ASAP.

Our research group has another binner MetaCOAG (preprint: https://www.biorxiv.org/content/10.1101/2021.09.10.459728v1) by @Vini2. I have actually tried to use the MetaCOAG idea in the deep learning model with her help. Just letting you know as this might be something worth looking into. MetaCOAG has been well tested on assembled contigs.

Keep in touch!

Cheers Anuradha

cazzlewazzle89 commented 2 years ago

Thanks @anuradhawick

That fixed the issue I was having and the software appears to have run successfully. I assume the second column in the bins.txt file contains the (zero-based) binIDs into which to group the contigs?

Thanks for sharing the MetaCOAG software. I don't think it will be useful here as this is a nanopore-only dataset so I don't have short reads with which to calculate coverage. I will keep it in mind for future datasets though.

All the best, Calum

anuradhawick commented 2 years ago

Yes. The bins file will have 0-based bin IDs for each sequence in the input file.
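As an illustration of how such a file could be used, here is a minimal sketch that groups sequence IDs by bin. It assumes each line of bins.txt holds a sequence ID followed by an integer bin ID, separated by whitespace; the exact delimiter is not stated in this thread, and `read_bins` is a hypothetical helper rather than an LRBinner function.

```python
from collections import defaultdict


def read_bins(lines):
    """Map bin ID -> list of sequence IDs from 'seq_id bin_id' lines."""
    bins = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        seq_id, bin_id = fields[0], int(fields[1])
        bins[bin_id].append(seq_id)
    return dict(bins)


if __name__ == "__main__":
    example = ["contig_1 0", "contig_2 1", "contig_3 0"]  # made-up data
    for bin_id, seqs in sorted(read_bins(example).items()):
        print(bin_id, seqs)
```

For a real run, the lines could come from `open("bins.txt")` instead of the in-memory list.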

Let me know if there’s anything else, or if you have any feedback on the results.

Cheers Anuradha.

cazzlewazzle89 commented 2 years ago

Thanks @anuradhawick. There are a few contigs omitted from the output (present in the multifasta input but not in the bins.txt output). Is that to be expected?

anuradhawick commented 2 years ago

That is expected in the current implementation, because HDBSCAN outputs noise points and those are not binned. This is the main reason why I output bins.txt as a file containing both the seq id and the bin id.
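Because the sequence IDs are written alongside the bin IDs, the unbinned contigs can be recovered by comparing the input FASTA headers against bins.txt. A minimal sketch follows, assuming the sequence ID is the first whitespace-separated token of each FASTA header and of each bins.txt line; the helper names `fasta_ids` and `unbinned_ids` are hypothetical.

```python
def fasta_ids(lines):
    """Return sequence IDs (first token after '>') in input order."""
    ids = []
    for line in lines:
        # Skip sequence lines and headers that carry no ID at all.
        if line.startswith(">") and len(line.strip()) > 1:
            ids.append(line[1:].split()[0])
    return ids


def unbinned_ids(fasta_lines, bins_lines):
    """Return IDs present in the FASTA input but absent from bins.txt."""
    binned = {line.split()[0] for line in bins_lines if line.strip()}
    return [seq_id for seq_id in fasta_ids(fasta_lines) if seq_id not in binned]


if __name__ == "__main__":
    # Made-up example data standing in for the assembly and bins.txt.
    fasta = [">contig_1 len=8", "ACGTACGT", ">contig_2", "GGCC", ">contig_3", "TTAA"]
    bins = ["contig_1 0", "contig_3 1"]
    print(unbinned_ids(fasta, bins))
```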

Do you have any feedback on this? Is it worth improving this feature as a part of LRBinner in future?

cazzlewazzle89 commented 2 years ago

That makes perfect sense to me. I assumed they were unbinned/undetermined and would have expected this but just wanted to confirm. Thanks

jianshu93 commented 2 years ago

What could be the reason for this error?

2022-07-25 22:50:09,153 - INFO - Command /storage/home/hcoda1/4/jzhao399/p-ktk3-0/miniconda3/envs/lrbinner/bin/LRBinner contigs --reads-path ../min17.noHost.sup1K.fastq --k-size 4 --threads 12 --output LRBinner --contigs assembly.fasta
2022-07-25 22:50:09,155 - INFO - Computing contig lengths
Traceback (most recent call last):
  File "/storage/home/hcoda1/4/jzhao399/p-ktk3-0/miniconda3/envs/lrbinner/bin/LRBinner", line 197, in <module>
    main()
  File "/storage/home/hcoda1/4/jzhao399/p-ktk3-0/miniconda3/envs/lrbinner/bin/LRBinner", line 179, in main
    pipelines.run_contig_binning(args)
  File "/storage/coda1/p-ktk3/0/jzhao399/rich_project_bio-konstantinidis/apps/LRBinner/mbcclr_utils/pipelines.py", line 63, in run_contig_binning
    pickle.dump(contig_length, open(f"{output}/profiles/contig_lengths.pkl", "wb+"))
FileNotFoundError: [Errno 2] No such file or directory: 'LRBinner/profiles/contig_lengths.pkl'

I followed the install guide exactly to install LRBinner and create the conda environment.

Thanks,

Jianshu