andy-z / ged4py

GEDCOM tools for Python
MIT License

Add test with few example GEDCOM files #13

Closed andy-z closed 4 years ago

andy-z commented 4 years ago

The current set of unit tests is limited to data that is encoded as strings in the Python code. It would be useful to add a set of sample GEDCOM files and tests that read and parse those files.
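A file-based test of this kind could be sketched roughly as follows. This is only an illustration, not the test layout ged4py actually uses: the helper names `find_gedcom_files`, `parse_all`, and `count_individuals` are hypothetical, and the last one assumes ged4py is installed and that `GedcomReader` and its `records0()` method behave as described in the ged4py documentation.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, List


def find_gedcom_files(root: Path) -> List[Path]:
    """Return all *.ged files under root (any letter case), sorted for a stable order."""
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix.lower() == ".ged"
    )


def parse_all(
    files: Iterable[Path], parse: Callable[[Path], object]
) -> Dict[Path, Exception]:
    """Call parse() on every file and collect the exceptions it raises, keyed by file."""
    failures: Dict[Path, Exception] = {}
    for path in files:
        try:
            parse(path)
        except Exception as exc:  # keep going so one bad file does not hide others
            failures[path] = exc
    return failures


def count_individuals(path: Path) -> int:
    """Hypothetical parse callable: count INDI records using ged4py."""
    # Imported lazily so the helpers above remain usable without ged4py installed.
    from ged4py.parser import GedcomReader

    with GedcomReader(str(path)) as reader:
        return sum(1 for _ in reader.records0("INDI"))
```

A test could then assert that `parse_all(find_gedcom_files(sample_dir), count_individuals)` returns an empty dictionary, where `sample_dir` is whatever directory holds the sample files.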

andy-z commented 4 years ago

https://www.gedcom.org/samples.html has a few samples, but they are marked as GEDCOM 5.5.5 and are relatively small.

http://heiner-eichmann.de/gedcom/gedcom.htm may have one or two samples; they look like they are from 1998.

http://www.geditcom.com/gedcom.html has a few "torture" files.

https://gedcomlibrary.com/gedcoms.html has a bunch of large files uploaded by the public.

https://chronoplexsoftware.com/myfamilytree/samples/ has a couple of interesting samples.

https://webtreeprint.com/tp_famous_gedcoms.php

andy-z commented 4 years ago

One thing I realized is that I do not want trees of real people to appear in a public git repo. I need to think of a way to structure things so that I can keep an unpublished (or private) collection of files that can be tested separately.
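One possible way to keep such a collection out of the public repo is to point the tests at an external directory and skip them when it is not available. The sketch below is only an assumption about how that could look: the environment variable name `GED4PY_TESTDATA`, the helper `private_data_dir`, and the test class are all hypothetical and not part of ged4py.

```python
import os
import unittest
from pathlib import Path

# Hypothetical variable name; it would point at a local, unpublished sample directory.
TESTDATA_ENV = "GED4PY_TESTDATA"


def private_data_dir(environ=os.environ):
    """Return the private sample directory, or None if it is not configured."""
    value = environ.get(TESTDATA_ENV)
    if not value:
        return None
    path = Path(value)
    return path if path.is_dir() else None


class PrivateSamplesTest(unittest.TestCase):
    """Tests that run only when the private collection is present."""

    def setUp(self):
        self.data_dir = private_data_dir()
        if self.data_dir is None:
            self.skipTest(f"{TESTDATA_ENV} is not set to an existing directory")

    def test_collection_is_not_empty(self):
        # Sanity check that the directory holds at least one .ged file.
        self.assertTrue(list(self.data_dir.rglob("*.ged")))
```

With this arrangement the public test suite still passes for anyone who does not have the private files, because the file-based tests are reported as skipped rather than failed.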

Tuisto59 commented 4 years ago

Hi Andy! Using the demo files that ship with genealogy software to test gedcompy is a good idea.

A list of the most widely used family tree software can be made, although many of these programs may be obsolete and impossible to obtain (here is a list from the French Wikipedia page: https://fr.wikipedia.org/wiki/Logiciel_de_g%C3%A9n%C3%A9alogie).

In particular, this makes it possible to test the parser with the different file encodings that each program offers when it exports data to GEDCOM. If you wish to carry out such tests but do not want those GEDCOM files in your repo, there are several solutions:

andy-z commented 4 years ago

After some thought, I decided that I want a separate private repo for the data files and the tests that run on those files. The main reasons are:

The new repo is called ged4py_testdata. It is my private repo and I don't plan to give anyone access to it, but I will use it as a development tool for testing ged4py. I'll collect a reasonable set of files there and add tests for basic functionality, and maybe for some specific features, as I go along.

andy-z commented 4 years ago

Closing this issue. I will continue adding stuff to my private ged4py_testdata repo as I go, which will probably trigger more tickets here.