Closed MarioStanke closed 9 years ago
I will double-check this in a bit, but at first glance I think the problem here may be that the genomes are entirely lowercase, and so the entire genome is considered soft-masked?
Yes - that must be it! I think it would be appropriate to print a warning message to the log file (this would be reported from cactus_setup.c) if an input contig is entirely lower-case.
I will do this today.
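Until such a warning exists, the same check can be run on the input files before starting an alignment. Below is a minimal sketch in Python, not the check that would go into cactus_setup.c; the function names `lowercase_counts` and `fully_softmasked` are made up for illustration. It counts lowercase characters per FASTA record and reports records whose sequence is entirely lowercase.

```python
import io


def lowercase_counts(handle):
    """Yield (name, total_length, lowercase_count) for each FASTA record."""
    name, total, lower = None, 0, 0
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, total, lower
            parts = line[1:].split()
            # Use the first word of the header as the record name.
            name, total, lower = (parts[0] if parts else ""), 0, 0
        elif name is not None:
            total += len(line)
            lower += sum(1 for ch in line if ch.islower())
    if name is not None:
        yield name, total, lower


def fully_softmasked(handle):
    """Return names of non-empty records that contain only lowercase letters."""
    return [name for name, total, lower in lowercase_counts(handle)
            if total > 0 and lower == total]


if __name__ == "__main__":
    # Hypothetical toy input; in practice pass open("NC_002952.fa").
    demo = io.StringIO(">seq1\nacgtacgt\n>seq2\nACGTacgt\n")
    for name in fully_softmasked(demo):
        print("WARNING: %s is entirely lowercase (treated as soft-masked)" % name)
```

Any record this reports would presumably be treated as fully masked, which matches the empty alignment described in this issue.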
On Fri, Mar 7, 2014 at 9:20 AM, Joel Armstrong notifications@github.com wrote:
I will double-check this in a bit, but at first glance I think the problem here may be that the genomes are entirely lowercase, and so the entire genome is considered soft-masked?
I will retry with upper case sequences on Monday. Thanks.
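For anyone wanting to try the same workaround, here is a rough Python sketch of the conversion. The helper name `uppercase_fasta` and the output file name in the usage note are assumptions for illustration. It leaves header lines untouched and upper-cases only the sequence lines. Note that this throws away any real soft-masking information, so it only makes sense when the lowercase letters do not actually mark repeats.

```python
import io


def uppercase_fasta(src, dst):
    """Copy FASTA text from src to dst, upper-casing sequence lines only."""
    for line in src:
        if line.startswith(">"):
            dst.write(line)          # keep header lines exactly as they are
        else:
            dst.write(line.upper())  # removes soft-masking from the sequence


if __name__ == "__main__":
    # Hypothetical toy input standing in for one of the genome files.
    src = io.StringIO(">NC_example\nacgtnacgt\n")
    dst = io.StringIO()
    uppercase_fasta(src, dst)
    print(dst.getvalue(), end="")
```

With real files this would be called as `uppercase_fasta(open("NC_002952.fa"), open("NC_002952.upper.fa", "w"))`, and test.cactus would then need to point at the new file.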
Yup, a warning or error message would help a lot.
Yes +1, we should fix this.
On Tue, Sep 23, 2014 at 6:20 AM, Michael Paulini notifications@github.com wrote:
Yup, a warning or error message would help a lot.
While my progressiveCactus installation works fine on other genomes, it does not seem to align anything on this input of three bacterial genomes of the same species that are very similar but not identical. Each genome consists of a single sequence, named NC_002952, NC_016941 and NC_017331, respectively.
Running
wget http://bioinf.uni-greifswald.de/bioinf/tmp/cactus/test.cactus
wget http://bioinf.uni-greifswald.de/bioinf/tmp/cactus/NC_002952.fa
wget http://bioinf.uni-greifswald.de/bioinf/tmp/cactus/NC_016941.fa
wget http://bioinf.uni-greifswald.de/bioinf/tmp/cactus/NC_017331.fa
wget http://bioinf.uni-greifswald.de/bioinf/tmp/cactus/cactus_progressive_config.xml
runProgressiveCactus.sh --configFile=cactus_progressive_config.xml test.cactus cactusout test.hal
halStats test.hal
produces
hal v2.1
((NC_002952:0.1,NC_016941:0.1)Anc1:0.1,NC_017331:0.1)Anc0;

GenomeName, NumChildren, Length, NumSequences, NumTopSegments, NumBottomSegments
Anc0, 2, 0, 1, 0, 0
Anc1, 2, 0, 0, 0, 0
NC_002952, 0, 2902619, 1, 1, 0
NC_016941, 0, 2762785, 1, 1, 0
NC_017331, 0, 3043210, 1, 1, 0
It looks like the ancestral sequences are empty and
hal2maf --noAncestors --refGenome NC_002952 test.hal test.maf
does not produce any alignment with more than 1 row.
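The empty ancestors can also be picked out of the halStats output automatically, which is handy as a quick sanity check after a run. This is only a sketch in Python that assumes the comma-separated table layout shown in the output above; `empty_genomes` is a made-up helper name, not part of the HAL tools.

```python
def empty_genomes(halstats_text):
    """Return names of genomes whose Length column is 0 in halStats output."""
    lines = [line.strip() for line in halstats_text.splitlines()]
    # The table starts at the header row; the Newick tree line above it
    # also contains commas, so it must not be mistaken for a table row.
    start = next((i for i, line in enumerate(lines)
                  if line.startswith("GenomeName")), None)
    if start is None:
        raise ValueError("no 'GenomeName' header row found")
    header = [field.strip() for field in lines[start].split(",")]
    name_col = header.index("GenomeName")
    length_col = header.index("Length")
    empty = []
    for line in lines[start + 1:]:
        if not line:
            break  # a blank line ends the table
        fields = [field.strip() for field in line.split(",")]
        if int(fields[length_col]) == 0:
            empty.append(fields[name_col])
    return empty


if __name__ == "__main__":
    report = """hal v2.1
((NC_002952:0.1,NC_016941:0.1)Anc1:0.1,NC_017331:0.1)Anc0;

GenomeName, NumChildren, Length, NumSequences, NumTopSegments, NumBottomSegments
Anc0, 2, 0, 1, 0, 0
Anc1, 2, 0, 0, 0, 0
NC_002952, 0, 2902619, 1, 1, 0
"""
    print(empty_genomes(report))
```

On the table from this issue it flags Anc0 and Anc1, i.e. both ancestors, while the three leaf genomes have non-zero length.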
cactus_progressive_config.xml differs from the file that comes with the distribution only in this parameter:
filterByIdentity="0"
It also does not produce an alignment if I omit the tree, or if I use the default config file.