mikolmogorov / Ragout

Chromosome-level scaffolding using multiple references

Required argument error with hal2mafMP.py #60

Open bernard-kim opened 4 years ago

bernard-kim commented 4 years ago

Hi Mikhail,

I'm trying to get Ragout (2.3) running, but it seems that HAL tools (2.1) expect a required argument that is never passed to them.

Specifically, at the start of the "Converting HAL to MAF" stage hal2mafMP.py returns:

RuntimeError: --splitBySequence, --refTargets or --refSequence required

The HAL2MAF command line in Ragout/ragout/synteny_backend/hal.py doesn't seem to contain this argument.

mikolmogorov commented 4 years ago

Hi,

OK, so it seems that the HAL API has changed at some point. Could you share a link to the exact version that you have installed?

Thanks, Mikhail

bernard-kim commented 4 years ago

I installed the release-V2.1 version, i.e. downloaded it with:

    git clone --branch release-V2.1 https://github.com/ComparativeGenomicsToolkit/hal.git

If it helps, things seem to work if I run hal2mafMP.py separately with the --splitBySequence argument, manually concatenate the split MAF files, and then run Ragout in -s maf mode.

    head -n2 genome_contig_1.maf > genome.temp
    for i in $(ls -v *.maf); do tail -n +3 "${i}" >> genome.temp; done
    rm *.maf && mv genome.temp genome.maf
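For anyone who prefers to do the concatenation in Python, here is a minimal sketch of the same idea as the shell commands above. It assumes, as those commands do, that every split MAF file starts with exactly two header lines. The names `concatenate_mafs` and `natural_key` are hypothetical helpers written for this illustration; they are not part of Ragout or HAL.

```python
import re
from pathlib import Path


def natural_key(path):
    """Sort key that compares embedded numbers numerically, similar to `ls -v`."""
    tokens = re.split(r"(\d+)", path.name)
    # re.split with a capturing group puts the digit runs at odd indices.
    return [int(tok) if i % 2 else tok for i, tok in enumerate(tokens)]


def concatenate_mafs(maf_dir, out_path, header_lines=2):
    """Merge every *.maf file in maf_dir into out_path, keeping a single header.

    Assumption: each split file begins with `header_lines` header lines.
    """
    out_path = Path(out_path)
    parts = sorted(
        (p for p in Path(maf_dir).glob("*.maf") if p.resolve() != out_path.resolve()),
        key=natural_key,
    )
    if not parts:
        raise FileNotFoundError(f"no .maf files found in {maf_dir}")
    with out_path.open("w") as out:
        for index, part in enumerate(parts):
            with part.open() as handle:
                for line_no, line in enumerate(handle):
                    # The header is copied from the first file only.
                    if index > 0 and line_no < header_lines:
                        continue
                    out.write(line)
    return parts
```

Unlike the shell version, this sketch leaves the split files in place, so the merged file can be checked before anything is deleted.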

The scaffolds look good on a very cursory examination (dotplot and BUSCO).

mikolmogorov commented 4 years ago

Ok, I'm glad that it worked this time. I'm following up with the HAL developers on how to better resolve this.

bernard-kim commented 4 years ago

So the initial issue with the required arguments seems to be fixed; however, running hal2mafMP.py produces the following error(s): "hal exception caught: Invalid range specified for convertGenome." I've only been able to test this with a couple of HAL files, so I'm not sure if this is a quirk of my alignment or something more general.

This issue isn't Ragout-specific. It occurs with hal2mafMP.py when trying to go HAL --> MAF without specifying any splitting method; if --splitBySequence is specified (like I mentioned previously) there is no issue.

I'll do more testing to see if I can reproduce this for a completely different alignment.

mikolmogorov commented 4 years ago

@bernard-kim are you using a hal2maf version from after this fix: https://github.com/ComparativeGenomicsToolkit/hal/issues/117?

If so, could you please post the full hal2mafMP.py command line and describe your issue in that thread (referenced above)?

stsmall commented 4 years ago

Hi @fenderglass, I am having the same issue that @bernard-kim details above. I followed hal#117 and grabbed the newest commit of hal2mafMP.py. The option errors are fixed, but the 'Invalid range specified for convertGenome' error remains. As stated, this is a problem with HAL and not Ragout, so it really only affects those starting from a HAL file. To summarize, there are two workarounds: 1) run hal2mafMP.py with --splitBySequence, concatenate the output, and then run Ragout with '-s maf'; or 2) edit the hal.py file to use hal2maf rather than hal2mafMP.py. Any other suggestions?

hal2mafMP.py command line:

    python hal2mafMP.py --numProc 5 ../Lik90.hal alignment.maf --targetGenomes Fun --refGenome Lik90 --noAncestors
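The two workarounds listed above could be spelled out as command lines roughly like this. It is only a sketch: `build_hal2maf_command` is a hypothetical helper, not Ragout's actual hal.py code, and it assumes that single-threaded hal2maf accepts the same --refGenome, --targetGenomes and --noAncestors options as the hal2mafMP.py command shown above. It only builds the argument list and does not run anything.

```python
import shlex


def build_hal2maf_command(hal_file, maf_out, ref_genome, target_genomes,
                          workaround="split", num_proc=5):
    """Return an argv list for one of the two workarounds (hypothetical helper)."""
    # Options copied from the hal2mafMP.py command line in this thread.
    common = [
        "--refGenome", ref_genome,
        "--targetGenomes", target_genomes,
        "--noAncestors",
    ]
    if workaround == "split":
        # Workaround 1: keep hal2mafMP.py but add --splitBySequence;
        # the split MAF files then have to be concatenated by hand.
        return ["python", "hal2mafMP.py", "--numProc", str(num_proc),
                hal_file, maf_out, "--splitBySequence"] + common
    if workaround == "single":
        # Workaround 2: call single-threaded hal2maf instead
        # (assumption: it takes the same options as above).
        return ["hal2maf", hal_file, maf_out] + common
    raise ValueError(f"unknown workaround: {workaround!r}")


if __name__ == "__main__":
    for mode in ("split", "single"):
        cmd = build_hal2maf_command("../Lik90.hal", "alignment.maf",
                                    "Lik90", "Fun", workaround=mode)
        print(shlex.join(cmd))
```

Returning a list instead of a single string means it can be passed straight to `subprocess.run` without shell quoting problems.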

thanks, @stsmall

mikolmogorov commented 4 years ago

@stsmall thanks for the feedback. Yes, I think both of your suggestions are valid; I can't think of other alternatives right now.

In fact, it might make sense to switch to single-threaded hal2maf for now. But I also hope that the HAL developers will be able to fix this in the near future.

Mikhail