Linekio / getmyancestors

Get GEDCOM files from FamilySearch.org

Feature request: include spouses families #23

Closed FlominatorTM closed 3 years ago

FlominatorTM commented 5 years ago

Hi there,

would it be possible to add an option that also retrieves the families of spouses, using the same ascending and descending options given for the individual?

I don't maintain my own local database but keep everything in FamilySearch, therefore I would like to be able to generate charts etc. from the GEDCOM file with genealogy software after downloading. For this it would be nice to have as much data in the download as possible.

Best

Flo

roobstan commented 4 years ago

The way to download as much data as possible is this: run getmyancestors a first time with python3 getmyancestors.py -o out.ged -m -u -p, then run the script below, which extracts all the FamilySearch IDs from the .ged file and prints them to the screen:

#!/usr/bin/env python3

import re

# Grab the 8-character FamilySearch ID following each _FSFTID tag
ids = [f[-8:] for f in re.findall(r'_FSFTID [^M]{1}.{7}', open("out.ged", "r").read(), re.MULTILINE)]
print(' '.join(ids))
print("Total IDs:", len(ids))

Then feed all those printed IDs back to getmyancestors, so that it can traverse all those names: python3 getmyancestors.py -o out.ged -m -u -p -i (paste the printed IDs after -i).

Rinse and repeat until your .ged file stops growing. At that point all reachable data has been downloaded.
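The 8-character slice above assumes every ID is exactly 8 characters long. A slightly more robust sketch (the helper name is mine, not part of getmyancestors) matches whole _FSFTID lines and deduplicates, so the same ID is not fed back to the server twice in one pass:

```python
import re

def extract_fsftids(gedcom_text):
    """Collect the values of all _FSFTID tags, deduplicated, in order of appearance."""
    ids = re.findall(r'^\d+ _FSFTID (\S+)$', gedcom_text, re.MULTILINE)
    seen = set()
    return [i for i in ids if not (i in seen or seen.add(i))]

# Tiny inline sample standing in for out.ged; in practice, read the previous run's file.
sample = "0 @I1@ INDI\n1 _FSFTID KWCH-ABC\n0 @I2@ INDI\n1 _FSFTID KWCH-XYZ\n1 _FSFTID KWCH-ABC\n"
print(" ".join(extract_fsftids(sample)))  # KWCH-ABC KWCH-XYZ
```

The printed list can be passed after -i on the next run; deduplication keeps a single pass from requesting the same person twice, though people already present in out.ged may still be re-downloaded.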

FlominatorTM commented 4 years ago

Thank you! Interesting approach. Here's the same one using Windows batch:

@ECHO OFF
type %1 | find "FSF" > rows.txt
for /f "delims=#" %%d in ('type rows.txt') do (
    SETLOCAL EnableDelayedExpansion
    SET "FSID=%%d"
    echo !FSID:~10,10!
    ENDLOCAL
)

What worries me a little is that it will then download a lot of data multiple times, which is somewhat inconvenient, since a normal download already takes two hours ...

roobstan commented 4 years ago

I had 600 people in my tree and the whole procedure took about 5 minutes.

Linekio commented 4 years ago

Be aware that with the -m option, there are _FSFTID tags for families, not just individuals.

With mergemyancestors.Gedcom:

from getmyancestors import Tree
from mergemyancestors import Gedcom

tree = Tree()

with open("out.ged") as file:
    gedcom = Gedcom(file, tree)
    print(" ".join(indi.fid for indi in gedcom.indi.values()))

FlominatorTM commented 4 years ago

> I had 600 people in my tree and the whole procedure took ~ 5 mins

I'm talking about trees of 15,000 people and more.

FlominatorTM commented 4 years ago

Thanks for this input from the two of you.

I managed to put all together in this script: get_recursive.txt

Some 12 hours ago FamilySearch got terribly slow, with two-minute response times between queries.

I also have the impression that I still run a lot of queries multiple times. What do you think, is that the case?

PS: I had set maxLengthIndi to 8191, as suggested on Stack Overflow, but I changed it to a lower number because I thought it might speed up execution. It didn't, though.

FlominatorTM commented 4 years ago

I stopped the script after running it for 12 hours, ending up with a merged GEDCOM file of 295,000 people. I think I will have to withdraw this feature request ;)

sebdu66 commented 4 years ago

Thanks @FlominatorTM for the script. I launched it 2 minutes ago; I hope it will work. I had to change one thing in your script: on line 40, I changed cmd = cmd + i + ".ged " to cmd = cmd + str(i) + ".ged ".

I'm using Python 3.8 and it refused to concatenate a string with an integer.

FlominatorTM commented 4 years ago

Good luck with that. Meanwhile I also tried my wife's side and stopped after an hour because it looked equally hopeless to ever finish.

sebdu66 commented 4 years ago

Thank you. I think the script will stop only once there is no space left on the device ;) It's still running. Everybody is connected at some point.

FlominatorTM commented 4 years ago

So, how did it go? I think my next approach would be filtering by place names ...

sebdu66 commented 4 years ago

I had to stop it because it was running on a laptop that I needed to move. I got almost 300,000 names. Filtering by places could be good.
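Filtering by place could be done after the download, on the GEDCOM file itself: scan each INDI record's PLAC lines and keep only the individuals whose events mention a given place. A minimal sketch under that assumption (the function name and sample records are mine):

```python
def individuals_matching_place(gedcom_text, place_substring):
    """Return xrefs of INDI records with any PLAC line containing the substring."""
    matches = []
    current = None
    for line in gedcom_text.splitlines():
        parts = line.split(" ", 2)
        if parts[0] == "0" and len(parts) == 3 and parts[2] == "INDI":
            current = parts[1]            # entered an individual record
        elif parts[0] == "0":
            current = None                # left the individual (FAM, TRLR, ...)
        elif (current and len(parts) == 3 and parts[1] == "PLAC"
                and place_substring.lower() in parts[2].lower()):
            if current not in matches:
                matches.append(current)
    return matches

sample = ("0 @I1@ INDI\n1 BIRT\n2 PLAC Stuttgart, Germany\n"
          "0 @I2@ INDI\n1 BIRT\n2 PLAC Paris, France\n0 TRLR")
print(individuals_matching_place(sample, "Germany"))  # ['@I1@']
```

The matching individuals could then be split into a smaller GEDCOM, or their _FSFTIDs used to seed a narrower getmyancestors run.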

Linekio commented 3 years ago

I'm closing this issue since it can be handled in multiple ways (with custom code or mergemyancestors).