Closed rbturnbull closed 2 years ago
@rbturnbull I took a look at the error message for the iqtree test on cicd
. Part of the problem is that the command
poetry run teiphy example/ubs_ephesians.xml example.nexus --states-present
doesn't collapse any trivial readings, so we end up hitting the maximum of 64 symbols pretty quickly. To prevent this, the command should probably look more like
poetry run teiphy -t"reconstructed" -m"overlap" -m"lac" --states-present example/ubs_ephesians.xml example.nexus
or
poetry run teiphy -t"reconstructed" -t"defective" -t"orthographic" -m"overlap" -m"lac" --states-present example/ubs_ephesians.xml example.nexus
But the real problem that is causing IQTREE to complain is that some of the symbols are capital versions of earlier symbols, and IQTREE treats these as case-insensitive. This is partly in compliance with the guidelines in Maddison, Swofford, and Maddison's paper on NEXUS, which notes that the base symbols in the SYMBOLS block are case insensitive. But the paper also explicitly states that symbols in the EQUATE block are case-sensitive, so an EQUATE symbol of 'A' should be valid even if 'a' is a base symbol. IQTREE's parser does not seem to make this distinction. I'm not sure if it's worth writing up an issue in their repo about this. In any event, this does mean that I should fix the code to use only 0-9 and a-z for base symbols, since 'a' and 'A' should not both be base symbols. If I set aside A-Z and a few extra ASCII symbols, then we should have a clean set of 32 base symbols and 32 EQUATE symbols.
Unfortunately, IQTREE seems to have a problem with any characters outside of 0-9 and a-z (up to capitalization of the latter set), including characters in the standard ASCII set, so even this will not work if we have too many EQUATE symbols. But since IQTREE just renders all ambiguities as missing characters anyway (see my recent comments on #4), it would probably be fine if I use other ASCII characters to pad out the EQUATE symbols set, add an option for the app to treat all ambiguous states as missing, and note in the documentation that outputs intended for IQTREE must be generated using this option.
i think the NEXUS standard is usually only loosely followed by the different phylogenetic software packages. The more flexible teiphy can be the better. I think an option to treat ambiguous states as missing is a good idea. Some packages might also will expect them with curly brackets in the data matrix as well instead of with an EQUATE symbol
All right, in my latest push on cicd
, I've commented out the code that generates the EQUATE symbols and writes the EQUATE block to NEXUS. Instead, I just write the ambiguous states with curly braces like you mentioned, as this requires fewer total symbols. The system test set up in iqtree.yaml
now works on that branch, but I'll wait to merge that branch into main
until we've implemented system tests for stemma
and mrbayes
.
@rbturnbull Apologies for the long line of identical commits, but we now have a passing system test for mrbayes
in place! That just leaves stemma
, whose system test is currently failing at the install step. Here's the script I have at the moment:
- run: poetry install
- name: Testing
run: |
poetry run teiphy -t reconstructed -t defective -t orthographic -m overlap -m lac -s"*" -s T --fill-correctors --format stemma example/ubs_ephesians.xml stemma_example
- name: Install Phylogenetics Package
run: |
git clone https://github.com/stemmatic/stemma.git
cd stemma
make
- name: Phylogenetics Run
run: |
stemma stemma_example
And here's the error I'm getting during installation:
cc -g -Wall -O3 -c -o stemma.o stemma.c
cc -g -Wall -O3 -c -o taxon.o taxon.c
cc -g -Wall -O3 -c -o net.o net.c
cc -g -Wall -O3 -c -o netmsg.o netmsg.c
cc -g -Wall -O3 -c -o hyp.o hyp.c
cc -g -Wall -O3 -c -o heur.o heur.c
cc -g -Wall -O3 -c -o cache.o cache.c
cc -g -Wall -O3 -c -o log.o log.c
cc -g -Wall -O3 -c -o boot.o boot.c
cc -g -Wall -O3 -c -o state.o state.c
cc stemma.o taxon.o net.o netmsg.o hyp.o heur.o cache.o log.o boot.o state.o -lm -o stemma
ln -f stemma /home/runner/bin/
ln: target '/home/runner/bin/' is not a directory: No such file or directory
make: *** [Makefile:25: /home/runner/bin/stemma] Error 1
Error: Process completed with exit code 2.
There seems to be an issue with the location of the bin
directory in the Docker container. Any idea what could be causing this? And @stemmatic, does the script above look like it's doing the right thing, or are there additional steps I need to include?
@rbturnbull I'm afraid I'm going to need your help debugging the stemma.yml
workflow. Both of the Makefiles for prep
and stemma
(the two tools we need to run stemma
on the generated file) set some environment variables with the line
BIN=$(HOME)/bin
So it's possible that this is related to the target '/home/runner/bin/' is not a directory: No such file or directory
error message that the runner throws when I try to make
each project, but I haven't been able to fix things by exporting different paths for HOME
, so I'm not sure what else to do.
Once we have this system working, I think we should be all set with the code. Once we have the formatted stemma images added to the paper, we should be ready to submit it to JOSS.
So it's possible that this is related to the
target '/home/runner/bin/' is not a directory: No such file or directory
error message that the runner throws when I try tomake
each project, but I haven't been able to fix things by exporting different paths forHOME
, so I'm not sure what else to do.
Yeah, the Makefile is trying to hard link the executable file into one’s personal bin directory. Either create the director or comment out that line in the Makefile are two easy work-arounds.
@stemmatic Thanks! It looks like creating the /home/bin
directory completes the installation:
sudo mkdir /root/bin
git clone https://github.com/stemmatic/prep.git
cd prep
sudo make
cd ..
git clone https://github.com/stemmatic/stemma.git
cd stemma
sudo make
It looks like prep
processed the file correctly, but stemma
threw the following error:
7: < ( 7) Variant mismatch: 0 (1) should have exactly 0
/root/bin/prep stemma_example
@ B10K1V1U24-26 [ εν.εφεσω – ]
Fatal error, terminating ...
Parallels=1; MSS=38; VarUnits=0; Pieces=40; Sets=105
@rbturnbull All right, with @stemmatic's help, we now have all system tests running successfully! Here's the output of running STEMMA with simulated annealing for 100 iterations:
And with that, I'll close this issue and merge cicd
back into main
.
I'll set up a github action workflow to run the output of teiphy with iqtree. @jjmccollum - you can use that as an example for other programs like stemma and MrBayes.