Disane87 / docudigger

Website scraper for getting invoices automagically as pdf (useful for taxes or DMS)
https://blog.disane.dev
MIT License

Amazon.com Error: No element found for selector: input[type=email] #935

Closed (timothevs closed 1 week ago)

timothevs commented 3 months ago

Version used: 2.0.6-dev.2 in Docker
Host: Ubuntu 22.04 LTS
TLD: .com

I'm running this via Docker. Here is the debug log from the first run:

[0] [debug] [2024-07-19 15:07:09] [scrape:amazon]:  Options: {
[0]     "logLevel": "debug",
[0]     "debug": true,
[0]     "logPath": "logs",
[0]     "recurring": true,
[0]     "recurringCron": "*/30 * * * *",
[0]     "fileDestinationFolder": "data",
[0]     "fileFallbackExentension": ".pdf",
[0]     "onlyNew": false,
[0]     "username": "timg",
[0]     "password": "e",
[0]     "tld": "com",
[0]     "yearFilter": 2024,
[0]     "pageFilter": 1
[0] }
[0] [debug] [2024-07-19 15:07:09] [scrape:amazon]:  Getting selectors...
[0] [debug] [2024-07-19 15:07:09] [scrape:amazon]:  Selectors: {
[0]     "orderCards": "div.order.js-order-card",
[0]     "invoiceSpans": "span.hide-if-no-js .a-declarative[data-action=\"a-popover\"]",
[0]     "orderNr": ".yohtmlc-order-id span:nth-last-child(1) bdi",
[0]     "orderDate": ".a-column .a-row:nth-last-child(1) span",
[0]     "popover": "#a-popover-content-{{index}}",
[0]     "invoiceList": "ul.invoice-list",
[0]     "invoiceLinks": "a[href*=\"invoice.pdf\"]",
[0]     "pagination": "ul.a-pagination li.a-normal:nth-last-child(2) a",
[0]     "yearFilter": "select[name=\"timeFilter\"] option",
[0]     "authError": "#auth-error-message-box .a-unordered-list li",
[0]     "authWarning": "#auth-warning-message-box .a-unordered-list li",
[0]     "captchaImage": "div.cvf-captcha-img img[alt~=\"captcha\"]"
[0] }
[0]     Error: No element found for selector: input[type=email]
[0] docudigger scrape all exited with code 1

Thanks!

PS: real username/password removed/replaced.
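Since the debug log prints the full options object, credentials have to be stripped by hand before a log is posted. A minimal sketch of masking them before logging could look like the following. `redactOptions` and `SENSITIVE_KEYS` are hypothetical names for illustration and are not part of docudigger; the key names are simply the ones visible in the log above.

```typescript
// Hypothetical helper, NOT part of docudigger: masks sensitive option
// values so a debug log can be shared without manual editing.
type Options = Record<string, unknown>;

// Assumed key names, taken from the options printed in the log above.
const SENSITIVE_KEYS = ["username", "password"];

function redactOptions(options: Options, keys: string[] = SENSITIVE_KEYS): Options {
  // Shallow copy so the caller's options object is left untouched.
  const redacted: Options = { ...options };
  for (const key of keys) {
    if (key in redacted) {
      redacted[key] = "***";
    }
  }
  return redacted;
}

// Usage: log the redacted copy instead of the raw options.
const exampleOptions = redactOptions({
  username: "someone@example.com",
  password: "secret",
  tld: "com",
});
console.log(JSON.stringify(exampleOptions, null, 4));
```

Only top-level keys are masked here; nested option objects would need a recursive variant.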

Disane87 commented 2 months ago

It seems Amazon changed the login page and some of its DOM structure. Could you please retest this and, if that's still not working, save the page ("Save as" in the browser) and attach it?

It's pretty awkward to test different TLDs with Amazon because you need an account with some orders, and Amazon changes its site frequently to stop scrapers. Unfortunately, the DOM structure also differs across TLDs.
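One way a scraper can tolerate DOM differences like these is to try several candidate selectors and use the first one that matches. The following is only a sketch of that idea, not docudigger's actual implementation. `findFirstSelector`, `QueryablePage`, and the candidate list are hypothetical; only `input[type=email]` comes from the error in this issue, and the other selectors are assumptions that would need checking against the real login page for each TLD.

```typescript
// Minimal structural type so the sketch runs without Puppeteer installed.
// It only assumes a $() method that resolves to a handle or null.
interface QueryablePage {
  $(selector: string): Promise<object | null>;
}

// Hypothetical candidates: only the first one appears in this issue.
const EMAIL_INPUT_CANDIDATES = [
  "input[type=email]",
  "input[name=email]", // assumption
  "#ap_email", // assumption
];

// Returns the first selector that matches an element on the page,
// or throws an error listing everything that was tried.
async function findFirstSelector(
  page: QueryablePage,
  candidates: string[],
): Promise<string> {
  for (const selector of candidates) {
    if ((await page.$(selector)) !== null) {
      return selector;
    }
  }
  throw new Error(`None of the selectors matched: ${candidates.join(", ")}`);
}

// Usage with a fake page that only "contains" the last candidate.
const fakePage: QueryablePage = {
  $: async (selector) => (selector === "#ap_email" ? {} : null),
};
findFirstSelector(fakePage, EMAIL_INPUT_CANDIDATES).then((selector) =>
  console.log(`Using selector: ${selector}`),
);
```

Listing every attempted selector in the error message makes a report like the one above easier to diagnose, since it shows which variants of the page were already covered.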

Disane87 commented 1 week ago

Please try the latest version, which includes several fixes. I hope it works for you now.