teamshortcut / Discord-Maths-Bot

Uses the Kings Maths School Seven Day Maths Weekly Challenge and their archive of problems for a Python Discord bot.
MIT License

Display issues - HTML to Markdown #10

Open teamshortcut opened 6 years ago

teamshortcut commented 6 years ago

Images

e.g. Challenge 18 at https://www.kcl.ac.uk/mathsschool/weekly-maths-challenge/previouschallenges.aspx

Subscripts/superscripts

For example, x squared would simply be displayed as x2.

Unicode superscript characters: ⁰¹²³⁴⁵⁶⁷⁸⁹

Unicode subscript characters: ₀₁₂₃₄₅₆₇₈₉

https://en.wikipedia.org/wiki/Superscripts_and_Subscripts
https://www.fileformat.info/info/unicode/block/superscripts_and_subscripts/list.htm
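A minimal sketch of the manual approach, using only the digit characters listed above. The names `superscript` and `subscript` are made up for illustration and are not part of the bot's code.

```python
# Translation tables built from the digit characters listed above.
SUPERSCRIPT_DIGITS = str.maketrans("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹")
SUBSCRIPT_DIGITS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")


def superscript(text):
    """Replace ASCII digits with superscript digits; other characters pass through."""
    return text.translate(SUPERSCRIPT_DIGITS)


def subscript(text):
    """Replace ASCII digits with subscript digits; other characters pass through."""
    return text.translate(SUBSCRIPT_DIGITS)


print("x" + superscript("2"))   # x squared
print("a" + subscript("12"))
```

Anything outside the ten digits (such as + or n) is left unchanged by these tables.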

(Harder) Rather than manually checking for certain characters, the code could check whether a subscript/superscript version of a character exists in Unicode. This would allow characters like + (⁺ and ₊) to be converted without manual filtering, which is useful for recurrence relations or indices problems. The Kings Maths School website currently seems to use <sup> and <sub>.
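One possible way to do the "does it exist in Unicode" check is to go through character names with the standard `unicodedata` module: "+" is named PLUS SIGN, and its superscript form is named SUPERSCRIPT PLUS SIGN. The sketch below rests on that naming pattern. The function names, the small alias table, and the `^(...)` / `_(...)` fallback are all assumptions made for illustration, and the regular expression only handles simple, non-nested `<sup>`/`<sub>` tags.

```python
import re
import unicodedata

# Names that do not follow the "<KIND> <NAME>" pattern directly (assumed alias table).
NAME_ALIASES = {"HYPHEN-MINUS": "MINUS"}


def to_script(text, kind):
    """Return text in Unicode script form, or None if any character lacks one.

    kind is "SUPERSCRIPT" or "SUBSCRIPT".
    """
    converted = []
    for ch in text:
        try:
            name = unicodedata.name(ch)
            # "DIGIT TWO" becomes "TWO", so the lookup is "SUPERSCRIPT TWO".
            if name.startswith("DIGIT "):
                name = name[len("DIGIT "):]
            name = NAME_ALIASES.get(name, name)
            converted.append(unicodedata.lookup(f"{kind} {name}"))
        except (KeyError, ValueError):
            # ValueError: character has no name; KeyError: no script form exists.
            return None
    return "".join(converted)


def convert_scripts(html):
    """Replace simple <sup>/<sub> tags with Unicode characters where possible."""
    def replace(match):
        is_sup = match.group(1).lower() == "sup"
        result = to_script(match.group(2), "SUPERSCRIPT" if is_sup else "SUBSCRIPT")
        if result is None:
            # Plain-text fallback when not every character can be converted.
            return ("^" if is_sup else "_") + "(" + match.group(2) + ")"
        return result

    return re.sub(r"<(sup|sub)>(.*?)</\1>", replace, html,
                  flags=re.IGNORECASE | re.DOTALL)


print(convert_scripts("x<sup>2</sup> + 2<sup>n+1</sup>"))
print(convert_scripts("u<sub>n+1</sub> = u<sub>n</sub> + 3"))
```

Subscript letters are named differently in Unicode (for example LATIN SUBSCRIPT SMALL LETTER N), so this lookup misses them and the second call falls back to the plain-text form. Converting a tag only when every character inside it has a script form avoids half-converted output.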

ol and ul

e.g. Challenge 81 at https://www.kcl.ac.uk/mathsschool/weekly-maths-challenge/challenges-81-100.aspx. Because the list items are children of the <ol>/<ul> element, the code does not catch them, since it uses .next_sibling and .contents.
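A rough sketch of one way to reach nested list items: walk every tag in document order and keep a stack of open lists. It uses the standard library's `html.parser` rather than the sibling-based traversal described above, so it is a swapped-in technique and not the bot's current code. The class name, the sample HTML, and the two-space indent per nesting level are assumptions.

```python
from html.parser import HTMLParser


class ListToMarkdown(HTMLParser):
    """Collect text line by line, turning <ol>/<ul> items into Markdown list lines."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self.open_lists = []  # one [tag, item_count] entry per open <ol>/<ul>

    def handle_starttag(self, tag, attrs):
        if tag in ("ol", "ul"):
            self.open_lists.append([tag, 0])
        elif tag == "li" and self.open_lists:
            current = self.open_lists[-1]
            current[1] += 1
            indent = "  " * (len(self.open_lists) - 1)
            marker = f"{current[1]}." if current[0] == "ol" else "-"
            self.lines.append(f"{indent}{marker} ")

    def handle_endtag(self, tag):
        if tag in ("ol", "ul") and self.open_lists:
            self.open_lists.pop()

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if not text:
            return
        if self.open_lists and self.lines:
            # Text inside a list belongs to the most recent list line.
            if not self.lines[-1].endswith(" "):
                self.lines[-1] += " "
            self.lines[-1] += text
        else:
            self.lines.append(text)


def lists_to_markdown(html):
    parser = ListToMarkdown()
    parser.feed(html)
    return "\n".join(parser.lines)


sample = ("<p>Find:</p><ol><li>the sum</li>"
          "<li>the product<ul><li>of primes</li></ul></li></ol>")
print(lists_to_markdown(sample))
```

Because the parser sees every start tag regardless of depth, list items are picked up even when they are several levels below the element the code started from.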

Tables

e.g. Challenge 5 at https://www.kcl.ac.uk/mathsschool/weekly-maths-challenge/previouschallenges.aspx


Could be fun to create a sub/superscript library?

A general HTML -> Markdown solution would be helpful; these might be useful:
https://github.com/kennethreitz/pyandoc
https://github.com/Alir3z4/html2text
https://github.com/gaojiuli/tomd

teamshortcut commented 6 years ago

Images seem to be stored as relative links, not absolute links.
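If the links are relative, they can be resolved against the URL of the page they came from. A small sketch using the standard library; the image path in the example is invented, and `ImageCollector` / `image_urls` are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

PAGE_URL = ("https://www.kcl.ac.uk/mathsschool/"
            "weekly-maths-challenge/previouschallenges.aspx")


class ImageCollector(HTMLParser):
    """Collect absolute URLs for every <img src=...> in a page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # urljoin leaves absolute URLs alone and resolves relative ones.
                self.urls.append(urljoin(self.page_url, src))


def image_urls(html, page_url=PAGE_URL):
    collector = ImageCollector(page_url)
    collector.feed(html)
    return collector.urls


# Hypothetical markup; the real image paths on the site may differ.
print(image_urls('<p><img src="/newimages/challenge18.png" alt=""></p>'))
```

The resulting absolute URLs could then be sent as plain links or embedded, depending on how the bot posts messages.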

teamshortcut commented 6 years ago

pypandoc (https://pypi.org/project/pypandoc/) seems like it could be a good solution.
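A hedged sketch of what that could look like, assuming both the pypandoc package and a pandoc binary are installed. `html_to_markdown` is a hypothetical wrapper name, and returning None when the conversion is unavailable is an illustrative choice only.

```python
def html_to_markdown(html):
    """Convert an HTML fragment to Markdown via pandoc, or return None if unavailable."""
    try:
        # Imported here so the rest of the bot still loads without pypandoc.
        import pypandoc
    except ImportError:
        return None
    try:
        return pypandoc.convert_text(html, "markdown", format="html")
    except OSError:
        # pypandoc raises OSError when it cannot find a pandoc binary.
        return None


result = html_to_markdown("<p>Hello <strong>world</strong></p>")
print(result if result is not None else "pandoc not available")
```

How pandoc renders superscripts, lists, and tables would still need checking against what Discord actually displays.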