Closed chun-he-316 closed 7 months ago
You need to use --dupeMode single to make sure there's at most one row per genome per MAF block.
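One way to check whether a MAF really has at most one row per genome in every block is a short script like the one below. This is a minimal sketch, not part of cactus: the function names are hypothetical, and it assumes source names have the form genome.contig, so the genome is taken to be everything before the first dot.

```python
import gzip
from collections import Counter


def open_maf(path):
    """Open a MAF file as text, transparently handling .gz (helper for the usage note)."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def genome_of(src):
    # Assumption: MAF source names look like "genome.contig".
    return src.split(".", 1)[0]


def blocks_with_duplicate_rows(lines):
    """Yield (block_index, {genome: row_count}) for blocks where a genome has more than one row."""
    index = -1
    counts = None

    def duplicates(counter):
        return {genome: n for genome, n in counter.items() if n > 1}

    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "a":
            # A new alignment block starts: report the previous one if needed.
            if counts is not None and duplicates(counts):
                yield index, duplicates(counts)
            index += 1
            counts = Counter()
        elif fields[0] == "s" and counts is not None and len(fields) >= 2:
            counts[genome_of(fields[1])] += 1
    # Report the final block, which has no following "a" line.
    if counts is not None and duplicates(counts):
        yield index, duplicates(counts)


# Usage (file name taken from this thread):
# with open_maf("evolver27species.maf.gz") as fh:
#     for block_index, dupes in blocks_with_duplicate_rows(fh):
#         print(block_index, dupes)
```

If this prints nothing for a MAF generated with --dupeMode single, duplicate rows are not what phyloFit is complaining about.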
I have used "cactus-hal2maf js_hal2maf evolver27species.hal evolver27species.maf.gz --refGenome xxxx --chunkSize 500000 --dupeMode single", but it still returns "ERROR: bad integers or strand in MAF (strand must be + for reference sequence) --". What can I do?
If you can find a block in evolver27species.maf.gz where the reference genome xxxx is on the negative strand, then that's definitely a bug in cactus-hal2maf. If that is the case (please let me know), you can correct it with mafStrander (included in cactus). But I suspect there may be a naming issue between the MAF and the tree you are giving to phyloFit, or something like that.
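The naming issue suspected above can be checked by comparing the genome names in the MAF with the leaf names of the tree passed to phyloFit. The sketch below is only an illustration: the function names are hypothetical, it assumes MAF source names of the form genome.contig, and it uses a simple regular expression that handles unquoted Newick labels only.

```python
import re


def maf_genomes(lines):
    """Collect genome names from the 's' rows of a MAF (text before the first dot)."""
    names = set()
    for line in lines:
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "s":
            # Assumption: source names look like "genome.contig".
            names.add(fields[1].split(".", 1)[0])
    return names


def newick_leaves(newick):
    """Return leaf labels of a Newick string (unquoted labels only).

    A leaf label directly follows "(" or ","; internal node labels follow ")"
    and are therefore not matched.
    """
    return set(re.findall(r"[(,]\s*([^():,;\s]+)", newick))


def compare_names(maf_names, tree_names):
    """Return (names only in the MAF, names only in the tree), each sorted."""
    return sorted(maf_names - tree_names), sorted(tree_names - maf_names)


# Usage (the tree file name is hypothetical):
# import gzip
# with gzip.open("evolver27species.maf.gz", "rt") as fh:
#     in_maf = maf_genomes(fh)
# in_tree = newick_leaves(open("tree.nwk").read())
# print(compare_names(in_maf, in_tree))
```

Two empty lists mean the names agree under these assumptions. Any name listed on only one side is worth checking before rerunning phyloFit.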
Hi, I had the same problem and used mafStrander to correct it. However, the corrected MAF file is about 10 GB larger than the original MAF file. Is this normal?
I'm still unsure about how this problem can happen. If someone can share a hal file, cactus-hal2maf command, and block with a reverse reference in the first row, I'd very much like to try to reproduce.
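To find a block worth sharing, a short script can scan the MAF for blocks whose first row is on the negative strand. This is a minimal sketch under two assumptions: the first "s" row of each block is the reference row, and rows follow the standard MAF column order (s, src, start, size, strand, srcSize, text). The function names are hypothetical.

```python
from itertools import chain


def _first_row_strand(block):
    """Return the strand of the first 's' row in a block, or None if there is none."""
    for row in block:
        fields = row.split()
        if len(fields) >= 5 and fields[0] == "s":
            return fields[4]
    return None


def first_row_reverse_blocks(lines):
    """Yield the text of every MAF block whose first 's' row has strand '-'."""
    block = []
    # The trailing "a" is a sentinel that flushes the last block.
    for line in chain(lines, ["a"]):
        if line.split()[:1] == ["a"]:
            if _first_row_strand(block) == "-":
                yield "\n".join(block)
            block = [line.rstrip("\n")]
        elif block and line.strip() and not line.startswith("#"):
            # Rows are only collected once the first block has started,
            # so header lines before it are skipped.
            block.append(line.rstrip("\n"))


# Usage (file name taken from this thread):
# import gzip
# with gzip.open("evolver27species.maf.gz", "rt") as fh:
#     print(next(first_row_reverse_blocks(fh), "no reverse-strand first row found"))
```

Printing only the first hit keeps the output small enough to paste into a reply.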
Thank you so much for your reply. My hal file comes from this URL: https://cgl.gi.ucsc.edu/data/cactus/241-mammalian-2020v2.hal. I used
halExtract --root fullTreeAnc112 241-mammalian-2020v2.hal 43primates.hal
to extract the hal file for the primates. Then I tried to use hal2maf to convert the format, but a segmentation fault occurred, so I ran
halExtract 43primates.hal 43primates.fixed.hal
to extract it again. Then I ran
cactus-hal2maf ./jobs 43primates.fixed.hal 43primates.fixed.cactus.maf --refGenome Homo_sapiens --noAncestors --dupeMode single --chunkSize 500000 --filterGapCausingDupes
to generate the MAF file. Is there any problem with these steps?
Thanks, I guess it'll take a bit but I'm rerunning this now. I'm using cactus v2.8.0 -- which version did you use?
Thank you so much. My cactus version is v2.7.2.
Thanks, I confirm that I can reproduce the problem. Will fix it asap.
OK, thank you very much. Could you please tell me which step causes this error?
(In reply to Glenn, 04/03/2024 21:23, Re: [ComparativeGenomicsToolkit/cactus] ERROR: bad integers or strand in MAF (strand must be + for reference sequence) (Issue #1320))
Hi,
I ran multiple whole-genome alignments using cactus to generate the hal file, and ran "cactus-hal2maf js_hal2maf evolver27species.hal evolver27species.maf.gz --refGenome xxxx --chunkSize 500000" to convert the hal file to MAF. Then, when I ran "phyloFit -i MAF evolver27species.maf", I got the error "ERROR: bad integers or strand in MAF (strand must be + for reference sequence) --". I do not know the reason.
Can you tell me how to resolve this problem? Thank you.
Best,
Chun