piratical / Madeline_2.0_PDE

The Madeline 2.0 Pedigree Drawing Engine (PDE) is a pedigree drawing program designed to handle large and complex pedigrees with an emphasis on readability and aesthetics. The program was designed primarily for human pedigrees.
GNU General Public License v2.0
19 stars 18 forks source link

Can't have half-siblings and grandparents at the same time? #28

Closed formerPolaris closed 8 years ago

formerPolaris commented 8 years ago

Hi there! Checking in from a gene diagnostics company. Madeline has been helpful so far.

Hope I'm not being a bother, but I've been having issues with the engine when trying to add both half-siblings and grandparents to a family. Half-siblings are well-supported, but as soon as grandparents are introduced, entire branches of the family will simply stop being drawn.

I'll add tab-delimited data files and the associated drawings to show you what I mean. Please note that the non-data .txt files are the XML drawing files, but GitHub doesn't support them with their native extension:

Working data: data_for_working_ped.txt Working pedigree drawing: working_ped.txt

Data with parents newly added to rightmost father: data_for_bad_ped.txt Broken pedigree drawing: bad_ped.txt

piratical commented 8 years ago

Hi, Vardarac,

Thanks for bringing this to my attention. I'll take a look at this within a day or two. Currently I'm getting a segfault on your "working_ped.txt" data file, so I don't know if you sent me the right file or whether there is a regression in the latest git revision of Madeline. I'll figure out more tomorrow!

formerPolaris commented 8 years ago

I just realized that the post-processing we do on the pedigree XML files is going to mess up display. I'll send you the files generated directly from Madeline prior to that. These drawing files should help clear things up.

Working pedigree XML (all relatives shown before grandparents added): working.txt

Broken (grandparents added; most of the tree gets cut off beyond the branch with the grandparents): broken.txt

If you're still having trouble with the data files let me know; I just did madeline2 <filename> to generate the above drawings so they should be OK.

Thanks much!

piratical commented 8 years ago

Hi, Vardarac,

There is a long-standing known bug in Madeline where "a spouse of a multiply-mated individual cannot themselves have multiple mates." So, in your "data_for_working_ped.txt", individual 226744 is treated as an "original founder" at the top of the descent tree. 226744 has three spouses: 226746, 226743, and 226747. Among these three, one of them, namely 226743, is herself "multiply mated" (with 226745 being her second spouse).

It is interesting that you get a completed drawing on your machine, while on my machine I get some weird messages and a segfault, all of which suggest to me an unititialized variable or memory leak somewhere.

A possible workaround would be to remove 226744's "extra" spouses 226746 and 226743.

I will keep working on this, but this is already known to be an elusive bug and it might take a while to fix.

Best - Ed

piratical commented 8 years ago

By the way, the aforementioned bug/limitation will definitely be eliminated in "Madeline version 3" (M3) which we have started to work on. M3 will address this and a host of other limitations in Madeline 2.0 PDE (M2). However, because M3 will be a nearly complete rewrite of many of the internals of Madeline, I can't give you an estimated time-to-completion on M3 yet.

formerPolaris commented 8 years ago

No problem! I appreciate all the info- we'll steer clear of those edge cases.

If it helps you track down the reason for the segfault, the version of Madeline2 we're using now - the one that drew the pedigrees above - is the build that was released after custom colors for Affected statuses.

piratical commented 8 years ago

Hi, Vardarac, I have some good news!

I completely rewrote the algorithm that determines descent trees. The result is that your "bad" data now is drawn with two descent trees. Although the drawing is still not yet completely correct, this is a vast improvement in the right direction. I should be able to figure out the remaining issues in a few days.

The huge improvement is that it now seems likely that the long-standing bug where Madeline would occasionally completely ignore certain descent trees (especially in the multiply-mated spouses cases) has been overcome.

Still imperfect, but you may want to try it out:

https://github.com/piratical/Madeline_2.0_PDE/commit/cfa997f04ac942cb4a59b34339a1a1ba5e1c7407

Best — Ed

piratical commented 8 years ago

Here's the output where you can see that individual 43 is not being drawn a second time and the curved connector is missing. Some flags are not being properly set, but it should be not too hard to fix:

vbad_001

piratical commented 8 years ago

Tonight's commit just needs some additional testing:

vbad_001_now_good

formerPolaris commented 8 years ago

That looks fantastic, Ed! The fast fix is hugely appreciated. We can create (non-consanguinous) datasets for testing really quickly, so if you would like us to provide sets for specific configurations just let me know. I'll see about getting/if Madeline is set up on our systems through Git.

piratical commented 8 years ago

Hi, Vardarac,

I'm planning to compile a set of test pedigrees that will include your original "working" set, your original "bad" set (vbad_001 in the PNG above), as well as a few additional test cases to stress-test the program. These will get included in the set of tests run on Travis CI.

I'm doing some additional work so that Madeline will situations with missing parents better and there will be some test pedigrees for that too which will also end up on the Travis continuous integration tests.

In reality, pedigrees such as vbad_001 shown in the PNG image are not yet drawn optimally but at least they are drawn correctly: Better optimizations will have to wait until Madeline version 3.

I'll let you know if I need any more test pedigrees. Of course let me know if you find something else that doesn't work. I plan to refine / refactor the current code a bit more in the coming weeks.

I'll close this issue after I have a larger set of test pedigrees that pass.

piratical commented 8 years ago

Hi, Vardarac,

The test pedigrees for this issue are now included in my new testing framework where they are now called descent_tree_004 and descent_tree_005.

If you have not yet seen it yet, the new testing framework includes a script (testing_framework/process.sh) which runs Madeline agains all of the test data, diffs against the previous expected results, and then results are conveniently displayed in a single web document (testing_framework/index.html) with embedded SVGs, so it will be easy to spot regressions going forward.

So, I'm going to close this issue as resolved for now. — Ed