mikolmogorov / Ragout

Chromosome-level scaffolding using multiple references

ERROR: No sequences read for genome X. #18

Closed nluhmann closed 7 years ago

nluhmann commented 7 years ago

Hello, I am currently trying to run Ragout to scaffold a rather fragmented assembly using several fully assembled references. When computing synteny blocks with Sibelia, no blocks are found for block size 5000 (I do not have contigs of length 5000), but there are some for block size 500 and 100. Subsequently I get the following error:

[21:34:49] root: INFO: Starting Ragout v2.0
[21:34:49] root: INFO: Synteny block scale set to 'small'
[21:34:50] root: INFO: Running Sibelia with block size 5000
[21:41:54] root: INFO: Running Sibelia with block size 500
[21:49:34] root: INFO: Running Sibelia with block size 100
[21:58:16] root: INFO: Phylogeny is taken from the recipe
[21:58:16] root: INFO: Processing permutation files
[21:58:16] root: DEBUG: Reading permutation file
[21:58:16] root: ERROR: An error occured while running Ragout:
[21:58:16] root: ERROR: No sequences read for genome X. Check recipe for correctness.

I used the recipe already with other contig assemblies, so it should be fine. Can Ragout propose a scaffold only based on the smaller blocks? Or can I set the block sizes to use manually?

Thank you.

kspham commented 7 years ago

What is the assembly size? And what is the sequence similarity?

mikolmogorov commented 7 years ago

Hi,

Yes, if your contigs are not long enough, it is possible to start from a smaller block size. You will need to edit the 'ragout/shared/config.py' file. The block sizes for the "small" setting are listed at line 35; you can start by removing "5000" from this list and checking whether that works for you.
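Before editing the list, it may help to check which block sizes the contigs could support at all. The snippet below is only a rough sketch and not part of Ragout: `contig_lengths` and `usable_block_sizes` are hypothetical helper names, the candidate sizes 5000, 500 and 100 are taken from the log above, and it assumes that a block size larger than the longest contig cannot produce blocks for that assembly.

```python
import io


def contig_lengths(fasta_handle):
    """Yield the length of each sequence in a FASTA stream."""
    length = None
    for line in fasta_handle:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def usable_block_sizes(fasta_handle, candidates=(5000, 500, 100)):
    """Keep only candidate block sizes no longer than the longest contig.

    Assumption: a block size above the longest contig cannot yield blocks.
    """
    longest = max(contig_lengths(fasta_handle), default=0)
    return [size for size in candidates if size <= longest]


# Toy assembly: one 1200 bp contig and one 300 bp contig.
demo = io.StringIO(">contig_1\n" + "A" * 1200 + "\n>contig_2\n" + "C" * 300 + "\n")
print(usable_block_sizes(demo))  # [500, 100]
```

For a real assembly, pass an open file handle instead of the `io.StringIO` demo. The result is only a hint about which entries to keep in the list; it does not guarantee that Sibelia finds blocks at those sizes.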

Best, Mikhail

nluhmann commented 7 years ago

Thanks! That worked so far. It might be nice to have an "advanced" set of parameters in the program to adjust such parameters directly beyond the suggested block size scales.

JPegorino commented 7 months ago

For anyone else receiving this error who winds up here, I thought I'd add my experience: you can get this same error message if one of your genome names contains a dot. In my case, I was using a reference file named 'GCF_900474695.1.fasta'. I got past the error by renaming it to 'GCF_900474695_1.fasta'.

I should add that the local parameters in my recipe file were named directly after the genomes, and therefore the recipe path variable name also had the dot (relevant lines extracted below) - I didn't test if this was specifically why I got the error:

#paths to genome fasta files
GCF_900474695.1.fasta = GCF_900474695.1.fasta

Ragout version = 2.3. Hopefully this might save a future user some head-scratching!
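If several reference files have accession-style names, the renaming can be scripted. This is a minimal sketch of the workaround described above, not something Ragout provides: `safe_genome_filename` is a hypothetical helper that replaces every dot in the file name except the one before the extension with an underscore. The matching genome names in the recipe would need the same change by hand.

```python
import os


def safe_genome_filename(filename):
    """Replace dots in the name part with underscores, keeping the extension."""
    stem, ext = os.path.splitext(filename)
    return stem.replace(".", "_") + ext


print(safe_genome_filename("GCF_900474695.1.fasta"))  # GCF_900474695_1.fasta
```

The function only computes the new name; the actual rename could then be done with `os.rename(old_name, safe_genome_filename(old_name))`.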