hillerlab / GenomeAlignmentTools

Tools for improving the sensitivity and specificity of genome alignments
MIT License

RepeatFiller.py output .chain file reports error when used as input in other chain file programs #21

Open John-Neddermeyer opened 1 year ago

John-Neddermeyer commented 1 year ago

Hi Dr. Hiller,

I am working on generating a pairwise whole genome alignment. I used the command below to run RepeatFiller on a chain file that was merged from several smaller chains and sorted, with chains generated using axtChain. RepeatFiller runs to completion, but when I use the .chain file output from RepeatFiller as input to subsequent programs I get the following error:

"q end mismatch 45149542 vs 45240145 line 565956 of Zonotrichia_albicollis_repeatfiller.chain"

RepeatFiller.py --chain Zonotrichia_albicollis.chain --T2bit ../galGal6.2bit --Q2bit ../Zonotrichia_albicollis_GCF_000385455.1_genomic_simple_filtered_masked.2bit -o Zonotrichia_albicollis_repeat_filler.chain

I do not get the same error message when I use the original .chain file or the output .chain file from patchChain.perl as input into other chain file manipulation programs. I'm wondering if RepeatFiller is not running to completion. I've gone to the specified line in the error message, but I'm not seeing anything out of the ordinary.
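A "q end mismatch" usually cannot be spotted by eye on the reported line, because it concerns a whole chain: the block sizes and gaps, summed from the header's start coordinates, do not land on the end coordinate in the header. The sketch below recomputes those sums for every chain, assuming the standard UCSC chain layout (a `chain` header with 13 fields, then `size dt dq` lines, then a final `size` line). The function name `check_chain_ends` and the wording of its messages are illustrative; this is not the code of the program that raised the error.

```python
def check_chain_ends(lines):
    """Return (line_no, chain_id, message) for chains whose blocks do not
    add up to the tEnd/qEnd stated in the chain header."""
    problems = []
    in_chain = False
    t = q = t_end = q_end = 0
    chain_id = None
    for line_no, raw in enumerate(lines, start=1):
        fields = raw.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comments
        if fields[0] == "chain":
            # chain score tName tSize tStrand tStart tEnd
            #       qName qSize qStrand qStart qEnd id
            t, t_end = int(fields[5]), int(fields[6])
            q, q_end = int(fields[10]), int(fields[11])
            chain_id = fields[12]
            in_chain = True
            continue
        if not in_chain:
            continue
        size = int(fields[0])
        t += size
        q += size
        if len(fields) == 3:
            # "size dt dq": an aligned block followed by gaps on each side
            t += int(fields[1])
            q += int(fields[2])
        else:
            # a lone "size" is the last block of the chain
            if t != t_end:
                problems.append((line_no, chain_id,
                                 f"t end mismatch: blocks reach {t}, header says {t_end}"))
            if q != q_end:
                problems.append((line_no, chain_id,
                                 f"q end mismatch: blocks reach {q}, header says {q_end}"))
            in_chain = False
    return problems
```

Passing an open file handle (`check_chain_ends(open(path))`, with `path` pointing at the RepeatFiller output) lists the affected chain IDs, which can then be compared against the same chains in the input file.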

Thank you for your time. John

MichaelHiller commented 1 year ago

Thanks for reporting that. Sounds like something with the coordinates went wrong. I'll ask Katya to have a look.

1) In the meantime, could you pls try to compute the chains using https://github.com/hillerlab/make_lastz_chains, which automates all steps, including lastz, axtChain, RepeatFiller and chainCleaner. Pls send a ping if this throws the same error.

2) We already have chains between these 2 assemblies. Pls download them from https://genome.senckenberg.de/download/ZonotrichiaForJohn/ so that you can continue, but I would really like to know if make_lastz_chains produces the same error.

John-Neddermeyer commented 1 year ago

Hi Dr. Hiller,

Thank you for the completed chains and nets. I've got the pipeline running now and will let you know how it ends up. I'm not sure if these issues have been documented, but importing Iterable from collections no longer works in the most recent versions of Python, which return this error: "ImportError: cannot import name 'Iterable' from 'collections'". To run the pipeline I needed to switch to a Python version earlier than 3.10. Similarly, the most recent version of Nextflow is incompatible with the current pipeline formulation, and Nextflow version 22.10.x or lower needs to be installed to run the pipeline.
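For context on the import error: the abstract base classes have lived in `collections.abc` since Python 3.3, and the old aliases in `collections` were removed in Python 3.10. A minimal sketch of an import that works on both sides of that change is shown below. Whether the failing import sits in the pipeline's own scripts or in one of its dependencies is not stated in this thread, so where to apply it is an assumption, and `is_iterable` is only an illustrative helper.

```python
try:
    # Location since Python 3.3, and the only one that works on 3.10+
    from collections.abc import Iterable
except ImportError:
    # Fallback for interpreters older than 3.3
    from collections import Iterable


def is_iterable(obj):
    """Illustrative helper: True if obj supports iteration."""
    return isinstance(obj, Iterable)
```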

Thank you for your time. John

MichaelHiller commented 1 year ago

I asked Katya and Bogdan to have a look. I know Bogdan fixed something regarding Nextflow recently in TOGA, because the new version is not backwards compatible.

osipovarev commented 1 year ago

Hi John, Regarding the issue with the RepeatFiller chains, could you please share the input data (the chains RepeatFiller takes and fails to produce the correct output), so I could reproduce the issue? Also, I'd suggest checking if you're using the latest version of RepeatFiller: there was a recent fix in the code regarding coordinates. Thanks!

shuifeng1988 commented 6 months ago

How did you solve this problem? I used Python 3.8.15 and Python 3.7, and the error still occurred!

MichaelHiller commented 6 months ago

I have asked @osipovarev to have a look.