Shamir-Lab / Recycler

This is the codebase for Recycler, described in our manuscript (https://academic.oup.com/bioinformatics/article/33/4/475/2623362) by Roye Rozov, Aya Brown Kav, David Bogumil, Naama Shterzer, Eran Halperin, Itzhak Mizrahi, and Ron Shamir.
BSD 3-Clause "New" or "Revised" License

Length of circular contigs #33

Closed Russel88 closed 6 years ago

Russel88 commented 6 years ago

Hey

Just a minor issue.

The length of the resulting circular contigs does not appear to match the length written in the headers. The length in the header is always a little larger than the actual length of the contig. The mismatch appears to always be a multiple of 86, at least in my case.

I use version v0.7 with a FASTG file from MEGAHIT.

Cheers, Jakob

dpellow commented 6 years ago

Hi Jakob - Thanks for your interest in the tool and the information. Is the mismatch 86 or a multiple of 86? What is the k-mer size?

Russel88 commented 6 years ago

It is a multiple of 86, so either 86, 172, 258, etc.

I have a guess about where 86 comes from. The k-mer size is 141, so I set -k 141; the default k-mer size in your code is 55, and 141 - 55 = 86.
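For anyone who wants to check their own output for the same symptom, here is a minimal sketch that compares the length declared in each FASTA header with the actual sequence length and reports whether the difference is a multiple of k - 55. It assumes the header carries the declared length as `length_<N>` (SPAdes-style naming); the function names `read_fasta` and `length_mismatches` are made up for this example and are not part of Recycler.

```python
import re

# Assumption: the declared length appears in the header as "length_<N>".
LENGTH_RE = re.compile(r"length_(\d+)")


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def length_mismatches(lines, k=141, default_k=55):
    """List records whose declared and actual lengths differ.

    Each entry is (header, declared, actual, diff, is_multiple), where
    is_multiple says whether diff is a multiple of k - default_k.
    """
    step = k - default_k
    mismatches = []
    for header, seq in read_fasta(lines):
        match = LENGTH_RE.search(header)
        if match is None:
            continue  # header does not follow the assumed naming scheme
        declared = int(match.group(1))
        diff = declared - len(seq)
        if diff != 0:
            is_multiple = step != 0 and diff % step == 0
            mismatches.append((header, declared, len(seq), diff, is_multiple))
    return mismatches
```

Passing an open file handle for the cycles FASTA as `lines` works, since the function only iterates over lines. With k = 141, a difference of 86, 172, or 258 is flagged as a multiple.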

dpellow commented 6 years ago

Hi Jakob - thanks for bringing this bug to our attention. We've just updated the code and this issue should be fixed. Please reopen the issue if you still see this problem after the update.

Thanks!

Russel88 commented 6 years ago

Thank you for the fast reply. Although, now I get this error:

  File "build/bdist.linux-x86_64/egg/recyclelib/utils.py", line 289, in get_spades_type_name
  File "build/bdist.linux-x86_64/egg/recyclelib/utils.py", line 133, in get_seq_from_path
TypeError: 'int' object has no attribute '__getitem__'
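For context, this message is what Python 2 reports when code indexes a plain integer (Python 3 words it as "'int' object is not subscriptable"). The thread does not show what was actually passed to `get_seq_from_path`, so the following is only a sketch of the general failure mode and one defensive pattern. The helpers `as_path`, `first_node`, and `index_error_message` are hypothetical and do not come from recyclelib.

```python
def as_path(path):
    """Hypothetical helper: wrap a bare node ID so it can be indexed like a path."""
    if isinstance(path, int):
        return (path,)
    return tuple(path)


def first_node(path):
    """Return the first node of a path, accepting a single int as a one-node path."""
    return as_path(path)[0]


def index_error_message(value):
    """Return the TypeError message raised by value[0], or None if indexing works."""
    try:
        value[0]
    except TypeError as exc:
        return str(exc)
    return None
```

Calling `index_error_message(7)` returns the interpreter's complaint about indexing an int, while `first_node(7)` and `first_node([7, 8])` both return 7.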

dpellow commented 6 years ago

OK thanks - should be fixed now.

Russel88 commented 6 years ago

It works now. Thank you!