GliderGeek / pocket2rm

get articles from read-later platform pocket to the remarkable paper tablet
MIT License
184 stars, 15 forks

No images or math formulas in epubs #14

Open JaimeGomezGarcia opened 3 years ago

JaimeGomezGarcia commented 3 years ago

None of the imported webpages include images, as mentioned in the Readme.md. Formulas like the ones shown in https://qiskit.org/textbook/ch-appendix/linear_algebra.html are not converted either.

andypillip commented 3 years ago

That's unfortunate.

In case you didn't know, pocket2rm delegates the conversion of websites to the Readability library. To Go-Readability, to be precise, a Go port of Readability. Any conversion errors can be documented here, but they would need to be fixed upstream in that library.

Concerning the images please refer to #4.

Concerning the math formulas on the site you mentioned, there is no way Readability could transform them: they are written in a LaTeX-style notation that is not standard HTML, and a JavaScript library turns it into the proper visual presentation in the browser. Readability only has access to the HTML, which reads as follows:

$$\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} \ + \ \begin{pmatrix} x_2 \\ y_2 \end{pmatrix} \ = \ \begin{pmatrix} x_1 \ + \ x_2 \\ y_1 \ + \ y_2 \end{pmatrix}$$
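To illustrate the point: because the formula reaches the converter as plain text between `$$` delimiters, the most a tool could do without a math renderer is detect it. Below is a minimal Go sketch using only the standard library. `findRawTeX` is a hypothetical helper written for this example; it is not part of pocket2rm or Go-Readability, and it assumes formulas are always delimited by `$$ ... $$`.

```go
package main

import (
	"fmt"
	"regexp"
)

// displayMath matches TeX display math delimited by $$ ... $$.
// (?s) lets . match newlines; .+? keeps each match as short as possible.
var displayMath = regexp.MustCompile(`(?s)\$\$(.+?)\$\$`)

// findRawTeX returns the TeX source of every $$...$$ block found in text.
// Hypothetical helper for illustration only (assumption: $$ delimiters).
func findRawTeX(text string) []string {
	var formulas []string
	for _, m := range displayMath.FindAllStringSubmatch(text, -1) {
		formulas = append(formulas, m[1]) // m[1] is the text between the delimiters
	}
	return formulas
}

func main() {
	// Shortened version of the raw text quoted above.
	extracted := `Vector addition: $$\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} \ + \ \begin{pmatrix} x_2 \\ y_2 \end{pmatrix}$$ and so on.`
	for i, f := range findRawTeX(extracted) {
		fmt.Printf("unrendered formula %d: %s\n", i+1, f)
	}
}
```

Detecting the blocks this way would only let a tool flag or skip them; actually rendering them would still require a TeX-aware step that, as far as this thread shows, the current pipeline does not have.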

The standardised way to mark up math formulas on web pages would be MathML, but since browser support for MathML is poor, that site uses a JavaScript library (which unfortunately doesn't even use MathML as its basic notation).

I don't know about Readability's support of MathML.