nikhil1raghav / kindle-send

Send webpages, documents and bookmarks to kindle
GNU Affero General Public License v3.0

dial tcp: lookup ws-na.amazon-adsystem.com: no such host missing data prefix #26

Open volkerwestphal opened 1 year ago

volkerwestphal commented 1 year ago

Describe the bug

kindle-send stops processing after it encounters errors while parsing the URL https://matt.might.net/articles/hello-perceptron/.

To Reproduce

Steps to reproduce the behavior:

$ kindle-send.exe -dry-run -title temp -url https://matt.might.net/articles/hello-perceptron/

Expected behavior

kindle-send should send the content of this web page, regardless of the incorrectly linked image.

Screenshots

C:\Users\Username\Work\Themen\Kindlesend>kindle-send.exe -dry-run -title temp -url https://matt.might.net/articles/hello-perceptron/
Config home not set, will look for config at  C:\Users\Username/.config/kindle-send
Loaded configuration
Fetched https://matt.might.net/articles/hello-perceptron/ --> Hello, Perceptron: An introduction to artificial neural networks
Embedding images in  Hello, Perceptron: An introduction to artificial neural networks
Downloading Images
Downloaded image https://ws-na.amazon-adsystem.com/widgets/q?_encoding=UTF8&ASIN=B092J75GML&Format=_SL250_&ID=AsinImage&MarketPlace=US&ServiceVersion=20070822&WS=1&tag=mmamzn06-20&language=en_US
Downloading Images
Downloaded image https://ir-na.amazon-adsystem.com/e/ir?t=mmamzn06-20&language=en_US&l=li3&o=1&a=B092J75GML
Downloading Images
Downloaded image https://matt.might.net/images/blog/linear-separability-AND.png
Downloading Images
Downloading Images
Setting img src from https://ws-na.amazon-adsystem.com/widgets/q?_encoding=UTF8&ASIN=B092J75GML&Format=_SL250_&ID=AsinImage&MarketPlace=US&ServiceVersion=20070822&WS=1&tag=mmamzn06-20&language=en_US to ../images/img6021680474870233255.png
Setting img src from https://ir-na.amazon-adsystem.com/e/ir?t=mmamzn06-20&language=en_US&l=li3&o=1&a=B092J75GML to ../images/img2824003947963501991.png
Setting img src from https://matt.might.net/images/blog/linear-separability-AND.png to ../images/img932481256471577820.png
Setting img src from https://ws-na.amazon-adsystem.com/widgets/q?_encoding=UTF8&ASIN=B092J75GML&Format=_SL250_&ID=AsinImage&MarketPlace=US&ServiceVersion=20070822&WS=1&tag=mmamzn06-20&language=en_US to ../images/img6021680474870233255.png
Setting img src from https://ir-na.amazon-adsystem.com/e/ir?t=mmamzn06-20&language=en_US&l=li3&o=1&a=B092J75GML to ../images/img2824003947963501991.png
Added 1 articles
Error retrieving "https://ws-na.amazon-adsystem.com/widgets/q?_encoding=UTF8&ASIN=B092J75GML&Format=_SL250_&ID=AsinImage&MarketPlace=US&ServiceVersion=20070822&WS=1&tag=mmamzn06-20&language=en_US" from source:
 open https://ws-na.amazon-adsystem.com/widgets/q?_encoding=UTF8&ASIN=B092J75GML&Format=_SL250_&ID=AsinImage&MarketPlace=US&ServiceVersion=20070822&WS=1&tag=mmamzn06-20&language=en_US: The filename, directory name, or volume label syntax is incorrect.
 Get "https://ws-na.amazon-adsystem.com/widgets/q?_encoding=UTF8&ASIN=B092J75GML&Format=_SL250_&ID=AsinImage&MarketPlace=US&ServiceVersion=20070822&WS=1&tag=mmamzn06-20&language=en_US": dial tcp: lookup ws-na.amazon-adsystem.com: no such host
 missing data prefix
Dry-run mode : Not sending files to device
Following files are saved

Desktop (please complete the following information):

Version: v1.0.3
BuildDate: 2022-09-27T06:47:46Z
Platform: windows/amd64

Additional context

The issue was detected on June 20, 2023. The website might change or update its content.

nikhil1raghav commented 1 year ago

Hi, thanks for reporting this. This issue does not occur in the latest pre-release, 2.0.0-rc-1. Can you try that version and confirm?

Commands are a bit different in that version. I tried kindle-send send https://matt.might.net/articles/hello-perceptron/ and was able to send the epub. Then I tried kindle-send download https://matt.might.net/articles/hello-perceptron/ and was able to download the correct epub.

volkerwestphal commented 1 year ago

Hi,

the new version of kindle-send still refuses to download, giving the "SKIPPING https://matt.might.net/articles/hello-perceptron/ : Error retrieving ..." message.

But I think I have an idea why. There is nothing special about this blog post. It just includes an ad and I have a DNS-based ad blocker in place (network-based). My computer is unable to resolve the DNS name for ws-na.amazon-adsystem.com.

I checked the blog post in a browser. Developer tools show that the browser simply skips over the unresolvable URLs and displays the blog without the missing image.

Looks like kindle-send actually requires all elements of a page to be available. Is this something to reconsider?

nikhil1raghav commented 1 year ago

Got it, will debug this further and push the fix in a few days.