JBlocklove / remarkable-daily-pdf

TLS error when trying to download a web page #5

Closed oren closed 1 year ago

oren commented 2 years ago

Steps to reproduce:

sh -c "$(wget https://raw.githubusercontent.com/JBlocklove/remarkable-daily-pdf/main/install.sh -O-)"
cd remarkable-daily-pdf/
chmod 755 rm-sync-pdf
./rm-sync-pdf -u https://deno.com/deploy/docs/hello-world -n deno.pdf

Downaloading file
Connecting to deno.com (34.120.54.55:443)
wget: note: TLS certificate validation not implemented
wget: TLS error from peer (alert code 40): handshake failure
wget: error getting response: Connection reset by peer

JBlocklove commented 2 years ago

Interesting. I assumed this wouldn't be a problem because the NYT Crossword URL I was testing with uses HTTPS. That download prints the same "TLS certificate validation not implemented" note, but then works just fine. It appears the reMarkable has OpenSSL, but something isn't configured right by default. I'll have to dig into this more, since it goes a bit outside my immediate networking knowledge.

Also, I ran wget on that URL from my laptop and it doesn't resolve to a PDF file. Pulling it gives you an HTML file, which won't display properly on your reMarkable, if it shows up at all. The -n flag only sets the name displayed in the reMarkable file browser; it cannot convert file types.
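
A quick way to tell whether a downloaded file really is a PDF, regardless of the name it was saved under, is to look at its first bytes: PDF files begin with `%PDF-`. The sketch below is not part of rm-sync-pdf; `is_pdf` is a hypothetical helper name, and it assumes a `head` that supports `-c` (GNU and BusyBox both do).

```shell
#!/bin/sh
# is_pdf FILE: succeed only if FILE starts with the PDF magic bytes "%PDF-".
# Hypothetical helper for illustration; not part of rm-sync-pdf.
is_pdf() {
    # Read the first 5 bytes; a missing or unreadable file yields an empty string.
    [ "$(head -c 5 "$1" 2>/dev/null)" = "%PDF-" ]
}
```

For example, `is_pdf deno.pdf || echo "not a PDF"` would flag an HTML page that was saved under a .pdf name.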

JBlocklove commented 2 years ago

Slight update as I poke into this: the reMarkable gets its wget command from BusyBox, which is a different implementation from the more "standard" GNU version. The BusyBox wget doesn't seem to support TLS and so will fail on pretty much any secured connection. It's possible that supporting this will require compiling a custom wget for the reMarkable. I was hoping to avoid installing any new binaries to make this work, but we might be SOL otherwise. I'll keep digging for a solution, but coming up with a satisfactory one might take a while.
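
To check which wget a device actually has, one option is to inspect the banner printed by `--help`. This is a rough sketch, assuming the BusyBox applet prints a line containing "BusyBox" and GNU wget prints one containing "GNU Wget"; `wget_flavor` is a hypothetical name, not something from this repo.

```shell
#!/bin/sh
# wget_flavor [PATH]: print "busybox", "gnu", or "unknown" for the given wget.
# Hypothetical helper; relies on the --help banner text, which is an assumption.
wget_flavor() {
    # BusyBox prints its help on stderr, so capture both streams.
    out=$("${1:-wget}" --help 2>&1)
    case $out in
        *BusyBox*)    echo busybox ;;
        *"GNU Wget"*) echo gnu ;;
        *)            echo unknown ;;
    esac
}
```

Running `wget_flavor` with no argument checks whichever wget is first on the PATH; `wget_flavor ./gnu-wget` checks a specific binary.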

JBlocklove commented 2 years ago

This should be fixed for now. I've added a binary of GNU wget that works on the reMarkable and will let you download from websites that use HTTPS. I will probably spend some time later coming up with a better solution, but this should work for now.

Leaving the issue open until I find a cleaner solution.

oren commented 2 years ago

I get ./rm-sync-pdf: line 23: ./gnu-wget: Permission denied

JBlocklove commented 2 years ago

Weird. If you go to the git repo on your remarkable and run ls -l gnu-wget what does it show?

oren commented 2 years ago

-rw-r--r-- 1 root root 3115800 Apr 26 17:16 gnu-wget

JBlocklove commented 2 years ago

Hm, it seems the executable permission bit didn't get pushed for some reason. I'll figure out why it's not working from a git perspective later, but for now you can fix it by running chmod a+x gnu-wget in that directory.
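
The same workaround can be written as a small guard that only touches the file when the execute bit is missing. This is a sketch, not code from rm-sync-pdf; `ensure_executable` is a hypothetical name.

```shell
#!/bin/sh
# ensure_executable FILE: add the execute bit if FILE exists but lacks it.
# Hypothetical helper for illustration; not part of rm-sync-pdf.
ensure_executable() {
    # Fail early with a message if the file is not there at all.
    [ -f "$1" ] || { echo "missing: $1" >&2; return 1; }
    # Only call chmod when the file is not already executable.
    [ -x "$1" ] || chmod a+x "$1"
}
```

Usage would be `ensure_executable ./gnu-wget` before the download step. On the repository side, `git ls-files -s gnu-wget` shows the mode git has recorded (100644 versus 100755), and `git update-index --chmod=+x gnu-wget` records the execute bit; whether a missing mode in the index was the cause here is not established in this thread.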

oren commented 2 years ago

download works!

reMarkable: ~/remarkable-daily-pdf/ ./rm-sync-pdf -u https://deno.com/deploy/docs/hello-world -n deno.pdf
Downaloading file
--2022-04-26 18:14:56--  https://deno.com/deploy/docs/hello-world
Resolving deno.com... 2600:1901:0:6d85::, 34.120.54.55
Connecting to deno.com|2600:1901:0:6d85::|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: '/home/root/.local/share/remarkable/xochitl/4b7ec897-1b69-4019-8c9a-345504d239ac.pdf'

/home/root/.local/share/remarkable/x     [ <=>                                                                  ]  18.92K  --.-KB/s    in 0.03s

2022-04-26 18:14:57 (597 KB/s) - '/home/root/.local/share/remarkable/xochitl/4b7ec897-1b69-4019-8c9a-345504d239ac.pdf' saved [19374]

'doc_metadata.txt' -> '/home/root/.local/share/remarkable/xochitl/4b7ec897-1b69-4019-8c9a-345504d239ac.metadata'
'doc_content.txt' -> '/home/root/.local/share/remarkable/xochitl/4b7ec897-1b69-4019-8c9a-345504d239ac.content'

My rM restarted and I see the new document, deno.pdf. It has an icon with an exclamation mark on it. When I click on it I see 'Unable to view. This document contains data that is not supported.' I tried manually rebooting the rM but I see the same message.

JBlocklove commented 2 years ago

That goes back to what I said in my first comment on this issue. This project doesn't handle any kind of file conversion, so the URL needs to return a PDF file. The URL you're using there downloads an HTML file, which the reMarkable can't view. I may implement some sort of website-to-EPUB or PDF conversion in the future, but as it stands this will only work if the URL returns a PDF.

oren commented 2 years ago

Oh, my bad. I assumed I could give it a webpage and it would convert it to PDF. I didn't see any conversion in your code, so I was a bit curious how it worked :smile:

oren commented 2 years ago

Converting a website might be a bit tricky to automate. Currently I sometimes use Firefox's Reader View, or even delete some HTML DOM elements, before I print to PDF to make sure the document is readable.

JBlocklove commented 1 year ago

The gnu-wget executable permissions seem to be fixed (I didn't do anything, but I just freshly cloned the repo and it was executable) and I think that was the last outstanding aspect of this issue.