neRok00 / ancestry-image-downloader

A Python script that downloads images from Ancestry.com that are related to records in your family tree.

Cannot download New York Arrivals microfilm images #5

Open alexbird opened 5 years ago

alexbird commented 5 years ago

I don't have a paid account any more so I cannot experiment or send a PR. Just giving some details in case it's useful, as the script was useful to me.

APID: 1,7488::2147483647

The script cannot get the record page for this APID, and the browser shows a generic "not found" page for the search.ancestry.com URL.

Processing APID 499 of 977 <APID 1,7488::2147483647>...
    > Getting the record page for the APID...
    > There was an error when trying to get the record page for the APID.
    > Aborted!

My GEDCOM file from Ancestry contains many (40) references to this APID, but they do not all correspond to the same record PAGE. There are actually 11 different PAGEs, which I found in the GEDCOM file, and manually downloaded, though this is very tedious work!

Three examples:

2 SOUR @S139266984@
3 PAGE Year: 1909; Arrival: New York, New York; Microfilm Serial: T715, 1897-1957; Microfilm Roll: Roll 1306; Line: 3; Page Number: 12
3 _APID 1,7488::2147483647
2 SOUR @S139266984@
3 PAGE Year: 1915; Arrival: New York, New York; Microfilm Serial: T715, 1897-1957; Microfilm Roll: Roll 2402; Line: 7; Page Number: 13
3 _APID 1,7488::2147483647
2 SOUR @S139266984@
3 PAGE Year: 1948; Arrival: New York, New York; Microfilm Serial: T715, 1897-1957; Microfilm Roll: Roll 7536; Line: 8; Page Number: 262
3 _APID 1,7488::2147483647

I've changed the line numbers to anonymise these a bit
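For anyone hitting the same thing, here is a rough sketch of how the distinct PAGE values per APID could be pulled out of a GEDCOM export instead of searching by hand. It is not part of the script; `pages_by_apid` is a hypothetical helper, and it assumes each citation looks like the examples above (a `SOUR` line followed by `PAGE` and then `_APID` one level deeper).

```python
from collections import defaultdict


def pages_by_apid(gedcom_lines):
    """Map each _APID value to the set of PAGE strings cited alongside it.

    Assumption: within a citation, PAGE appears before _APID, as in the
    examples in this thread. Citations with no PAGE line are ignored.
    """
    pages = defaultdict(set)
    cite_level = None  # level of the SOUR line we are currently inside
    page = None        # PAGE text seen so far in this citation
    for raw in gedcom_lines:
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2 or not parts[0].isdigit():
            continue
        level, tag = int(parts[0]), parts[1]
        value = parts[2] if len(parts) > 2 else ""
        if cite_level is not None and level <= cite_level:
            cite_level, page = None, None  # left the previous citation
        if tag == "SOUR" and level > 0:
            cite_level, page = level, None  # a new citation starts here
        elif cite_level is not None and level == cite_level + 1:
            if tag == "PAGE":
                page = value
            elif tag == "_APID" and page is not None:
                pages[value].add(page)
    return pages
```

Any APID whose set has more than one entry is a candidate for the problem described here, e.g. `{a: p for a, p in pages_by_apid(open("tree.ged", encoding="utf-8")).items() if len(p) > 1}` (the file name is a placeholder).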

Again, thanks for the script!

neRok00 commented 5 years ago

This is a strange problem. The third number in the APID is the record ID within the database, and the record is what determines the page and volume numbers, etc. So it is not normal for one record ID to return multiple different sets of information.
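One thing worth noting about this particular ID: 2147483647 is 2^31 - 1, the largest signed 32-bit integer. That may mean the export wrote a placeholder rather than a real record ID, but that is only a guess and nothing in this thread confirms it. A minimal sketch of splitting an APID and flagging that value follows; the `Apid` field names and both helpers are made up for illustration, and the meaning of the first number is not known here.

```python
import re
from typing import NamedTuple

INT32_MAX = 2**31 - 1  # 2147483647, the record ID seen in this issue


class Apid(NamedTuple):
    prefix: int     # first number; its meaning is not stated in this thread
    dbid: int       # second number, the database ID (7488 here)
    record_id: int  # third number, the record ID within the database


_APID_RE = re.compile(r"^(\d+),(\d+)::(\d+)$")


def parse_apid(text):
    """Split an APID string such as '1,7488::2147483647' into its numbers."""
    match = _APID_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised APID: {text!r}")
    return Apid(*(int(group) for group in match.groups()))


def looks_like_placeholder(apid):
    """True if the record ID is exactly the signed 32-bit maximum.

    Assumption: such an ID is suspicious. It does not prove the ID is invalid.
    """
    return apid.record_id == INT32_MAX
```

A caller could skip or specially report such APIDs rather than aborting on the record page request.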

If you visit http://search.ancestry.com/cgi-bin/sse.dll?indiv=1&dbid=7488&h=2147483647, which record do you actually see? You should be able to visit that page without a subscription (maybe only if the record is saved in your tree).
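For reference, the URL above is just the database ID and record ID dropped into a query string. A small sketch of building it, using only the parameters visible in that URL (`record_url` is a hypothetical name, not a function from the script):

```python
from urllib.parse import urlencode

# Base address copied from the URL quoted above.
SEARCH_BASE = "http://search.ancestry.com/cgi-bin/sse.dll"


def record_url(dbid, record_id):
    """Build the record page URL for a database ID and record ID."""
    query = urlencode({"indiv": 1, "dbid": dbid, "h": record_id})
    return f"{SEARCH_BASE}?{query}"


print(record_url(7488, 2147483647))
```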

alexbird commented 5 years ago

I just get a 302 redirect to /security/deny.aspx (N.B. I am logged into my free Ancestry account and have it open in other tabs):

<html><head><title>Object moved</title></head><body>
<h2>Object moved to <a href="https://www.ancestry.com/security/deny.aspx?sub=281479271972864&amp;dbid=7488&amp;url=https%3a%2f%2fsearch.ancestry.com%2fcgi-bin%2fsse.dll%3findiv%3d1%26dbid%3d7488%26h%3d2147483647%26requr%3d281479271972864%26ur%3d0&amp;gsfn=&amp;gsln=&amp;h=2147483647">here</a>.</h2>
</body></html>
<!-- SN:I-0833714AB7BF8 -->

Which leads to a 301:

<head><title>Document Moved</title></head>
<body><h1>Object Moved</h1>This document may be found <a HREF="http://www.ancestry.com/cs/offers/join?sub=281479271972864&amp;dbid=7488&amp;url=https%3a%2f%2fsearch.ancestry.com%2fcgi-bin%2fsse.dll%3findiv%3d1%26dbid%3d7488%26h%3d2147483647%26requr%3d281479271972864%26ur%3d0&amp;gsfn=&amp;gsln=&amp;h=2147483647">here</a></body>

That loads an Ancestry.com signup page ("See what discoveries you can make today."), which almost immediately redirects via JavaScript to an Ancestry.co.uk signup page (because I'm in the UK).
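In case it helps with error reporting, here is a sketch of how a script might recognise this redirect chain from the `Location` target alone, without following it. The two paths are the ones seen in the responses above; whether Ancestry uses other paths is unknown, and both function names are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit

# Paths observed in the 302 and 301 responses quoted in this comment.
BLOCKED_PATHS = ("/security/deny.aspx", "/cs/offers/join")


def is_paywall_redirect(location):
    """True if a redirect target points at the deny or signup page."""
    return urlsplit(location).path.lower() in BLOCKED_PATHS


def original_url(location):
    """Return the originally requested URL carried in the 'url' parameter."""
    values = parse_qs(urlsplit(location).query).get("url")
    return values[0] if values else None
```

With that, the script could print something like "access denied for this record" in place of the generic "error when trying to get the record page" message.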

I guess there are two obvious options here:

...and the latter is more likely, realistically!