chigc / stemmatology


Testing Adam & Eve Illustration Matrix #2

Open lennartrx opened 7 months ago

lennartrx commented 7 months ago

With Andreas' help, I also tested my matrix. We wanted to find out whether stemmatological methods can be applied to the images. My matrix consists of 184 taxa, each 48 characters long. The spreadsheet is available here (https://docs.google.com/spreadsheets/d/19lqFXWLqEgoUZrQC_IoGIMFPfZdjddOKfPmorfIft_8/edit#gid=0), and the NEXUS file is in the Stemmatology Illustration folder.

CPU time to create a tree

With 10 taxa: 0.01 sec

With 20 taxa: 2.22 sec (retained 989 trees)

With 25 taxa: 1h29

With 30 taxa: 5.4% progress after more than 2h

With 40 taxa: 0.00% progress after 1h of runtime
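One likely contributor to this steep growth, assuming the search is exact (exhaustive or branch-and-bound), which the numbers above do not confirm: the number of distinct unrooted, fully bifurcating trees for n taxa is (2n-5)!!, which grows faster than exponentially. A minimal sketch that evaluates this count for the taxon numbers tested above (the function name is only illustrative):

```python
def unrooted_tree_count(n_taxa: int) -> int:
    """Count unrooted, fully bifurcating trees for n_taxa labelled taxa.

    Uses the double factorial (2n-5)!! = 3 * 5 * ... * (2n-5), valid for n >= 3.
    For fewer than 3 taxa there is only one possible topology.
    """
    if n_taxa < 3:
        return 1
    count = 1
    # Odd factors 3, 5, ..., 2n-5
    for factor in range(3, 2 * n_taxa - 4, 2):
        count *= factor
    return count


if __name__ == "__main__":
    # Taxon counts taken from the timings listed above
    for n in (10, 20, 25, 30, 40):
        print(f"{n} taxa: {unrooted_tree_count(n):.3e} possible unrooted trees")
```

Branch-and-bound can skip large parts of this space, but only when the data let it rule out partial trees early, so the count is an upper bound on the work rather than a runtime prediction.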

To see if there was an issue with the data, I cut out the outlier mbs1708 (000-00---0-------------0000000------------------) and ran it again with 24 taxa.

With 24 taxa (outlier mbs1708 000-00---0-------------0000000------------------ cut out): 0.50h

This significantly improved the CPU time compared with the 1h29 needed for 25 taxa. We therefore concluded that taxa should be chosen to have as few "-" and "?" entries as possible. Still, there seems to be a hard limit at around 30-40 taxa, at least on our computers.
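Choosing taxa with few "-" and "?" entries can be done by ranking the rows by their share of such entries. The following is a minimal sketch, assuming the matrix has already been read into a dictionary of taxon name to character string. The function names and the two sample rows `example_a` and `example_b` are hypothetical; only the mbs1708 row is taken from this thread.

```python
def missing_fraction(states: str, missing_symbols: str = "-?") -> float:
    """Return the share of characters in a row that are gap or missing symbols."""
    if not states:
        return 0.0
    return sum(ch in missing_symbols for ch in states) / len(states)


def rank_by_completeness(matrix: dict) -> list:
    """Return (taxon, missing_fraction) pairs, most complete taxon first."""
    scored = [(name, missing_fraction(row)) for name, row in matrix.items()]
    return sorted(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    matrix = {
        # Row copied from the thread
        "mbs1708": "000-00---0-------------0000000------------------",
        # Hypothetical rows, for illustration only
        "example_a": "0101?0110-01",
        "example_b": "010100110101",
    }
    for name, fraction in rank_by_completeness(matrix):
        print(f"{name}: {fraction:.0%} missing or gap entries")
```

Taxa at the bottom of the ranking are the first candidates to leave out when the taxon set has to be reduced.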

To test this, I twice randomly generated 30 taxa, each 48 characters long and consisting only of 0s and 1s, and stopped the runs after 2h and 3h15 of runtime. I assume that the data was too random to find relations.

30 clean taxa: 0.00% after 2h runtime

30 clean taxa: 0.00% after 3h15 runtime
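For anyone who wants to repeat the random-data test, here is a minimal sketch of how such a matrix could be generated and written out as a NEXUS DATA block. It is not the script used for the runs above; the taxon names (`rand001`, ...), the seed, and the function names are assumptions for illustration.

```python
import random


def random_binary_matrix(n_taxa: int, n_chars: int, seed=None) -> dict:
    """Generate n_taxa rows of n_chars characters, each drawn uniformly from 0/1."""
    rng = random.Random(seed)  # a fixed seed makes the matrix reproducible
    return {
        f"rand{i:03d}": "".join(rng.choice("01") for _ in range(n_chars))
        for i in range(1, n_taxa + 1)
    }


def to_nexus(matrix: dict) -> str:
    """Format a taxon -> character-string mapping as a NEXUS DATA block."""
    n_chars = len(next(iter(matrix.values())))
    lines = [
        "#NEXUS",
        "BEGIN DATA;",
        f"  DIMENSIONS NTAX={len(matrix)} NCHAR={n_chars};",
        '  FORMAT DATATYPE=STANDARD SYMBOLS="01" MISSING=? GAP=-;',
        "  MATRIX",
    ]
    lines += [f"  {name}  {row}" for name, row in matrix.items()]
    lines += ["  ;", "END;"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Same dimensions as the test described above: 30 taxa, 48 characters
    matrix = random_binary_matrix(30, 48, seed=1)
    print(to_nexus(matrix))
```

Uniformly random rows carry no shared-derivation signal, so many trees are likely to score almost equally well, which would fit the assumption above that the data was too random.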