brechtm / rinohtype

The Python document processor
http://www.mos6581.org/rinohtype
GNU Affero General Public License v3.0

Math Support #222

Open hamzamohdzubair opened 3 years ago

hamzamohdzubair commented 3 years ago

What would need to be done to add math support to rinohtype? If I can understand exactly what needs to be done, I might be able to help.

hamzamohdzubair commented 3 years ago

A few questions regarding this:

  1. Have we decided to steer away from LaTeX rendering completely?
  2. Should we look at MathJax?
  3. What will the input look like: \symbol_name or some other format?
  4. Can we use this as our guide? Latex Comprehensive guide

These questions are mostly policy-based, and developers and contributors might already have formed an opinion on them. I am new to this, in fact I am new to open-source contribution in general, so I need some help to get started.

hamzamohdzubair commented 3 years ago

Do we need a mechanism to work with sphinx.ext.pngmath (maybe deprecated) or sphinx.ext.imgmath from here?

brechtm commented 3 years ago

At this point I don't have a good idea of the amount of work that is required for implementing maths rendering. I am pretty sure it will take significant effort. I would like to do this the proper way, implemented in pure Python, not relying on any external tool like TeX or Mathjax.

MathJax could be used as a stop-gap solution. You can even run JavaScript from Python using Js2Py. I have used Js2Py successfully in the past, but unfortunately it was a pain. I think time is better spent on a Python implementation.

I looked into input syntax years ago and decided on adopting the (La)TeX syntax first since that is widely used and relatively easy to read and edit. MathML, being XML, is just too verbose, IMO. Eventually, both (and more) formats could be supported, of course.
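
To illustrate the verbosity difference, here is the same simple fraction written in both notations. The expression itself is an arbitrary example chosen for this comparison, not something taken from rinohtype.

```python
# The fraction a/b in (La)TeX syntax
latex = r"\frac{a}{b}"

# The same fraction in presentation MathML
mathml = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<mfrac><mi>a</mi><mi>b</mi></mfrac>"
    "</math>"
)

# Compare the amount of markup needed for one small expression
print(f"LaTeX:  {len(latex)} characters")
print(f"MathML: {len(mathml)} characters")
```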

To sum things up: I think this will require a lot of effort and it's unlikely to be implemented any time soon. I did consider contacting NumFOCUS to ask whether rinohtype would be eligible for sponsorship for the development of this feature, but I have not looked into this yet. Reading about fiscal sponsorship on their site, the following requirement is problematic at this point, since rinohtype is still mostly a one-man show (though @alexfargus did contribute quite a number of patches recently!):

More than one contributor, as it’s highly unlikely a project will be accepted without a community of active contributors

It's also very difficult to estimate how many people are using rinohtype. PyPI stats show some activity, but it's impossible to interpret. I think we're facing a chicken-and-egg problem in that there isn't yet enough value to attract users, and there are too few users to build a community.

Note that rinohtype did have some support for typesetting maths at one point. That relied on matplotlib (https://github.com/brechtm/rinohtype/issues/136#issuecomment-459678723), but I removed it (when moving from PostScript to PDF output IIRC) since the range of maths matplotlib could render was very limited. Perhaps it is a good idea to look into restoring that (basic) functionality, just to be able to offer something until proper maths rendering is available?

brechtm commented 3 years ago

Do we need a mechanism to work with sphinx.ext.pngmath (maybe deprecated) or sphinx.ext.imgmath from here?

It may be relatively easy to add support for PDF output (using dvipdfm) to sphinx.ext.imgmath so that rinohtype can include its output in a higher quality than PNG. Apparently there is support for setting the baseline for inline maths (imgmath_use_preview), so the result may turn out to be acceptable.
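
For reference, a minimal conf.py sketch that turns on the existing options mentioned here. The option names are the ones documented for sphinx.ext.imgmath. PDF is not among its supported image formats, so "svg" is used below as the nearest vector format; the values are illustrative assumptions and not a tested rinohtype setup.

```python
# conf.py (sketch): illustrative values, not verified against rinohtype
extensions = ["sphinx.ext.imgmath"]

# imgmath documents "png" and "svg" as image formats; there is no "pdf" value
imgmath_image_format = "svg"

# Ask LaTeX (via the preview package) for the depth of inline maths,
# so that the image can be aligned with the text baseline
imgmath_use_preview = True
```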

I haven't used sphinx.ext.imgmath yet. Does that work as-is with rinohtype?

alexfargus commented 3 years ago

This piqued my interest so I have done a bit of digging.

Assuming that the subset of math support provided by matplotlib is a useful addition (a quick search did not turn up any information about how large this subset is), I reckon it should be possible to get a pure-Python solution without having to write everything from scratch.

The matplotlib mathtext module largely consists of 3 files:

In addition, matplotlib supports custom backends. All they need to do is draw lines and characters at specific points. rinohtype is rather good at this, so I imagine that putting together a minimally compatible backend should be achievable.
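
To make the "draw lines and characters at specific points" idea concrete, here is a rough sketch of how small such a drawing interface could be. It is not matplotlib's backend API and not rinohtype code: the RecordingBackend class, its method names and the crude fraction layout (fixed glyph width, hard-coded baseline shifts) are all hypothetical and only meant to show the shape of the problem.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class RecordingBackend:
    """Hypothetical minimal backend: it only records what it is asked to draw."""
    glyphs: List[Tuple[str, float, float, float]] = field(default_factory=list)
    rules: List[Tuple[float, float, float, float]] = field(default_factory=list)

    def draw_glyph(self, char, x, y, size):
        # A real backend would place a font glyph at (x, y) on the page
        self.glyphs.append((char, x, y, size))

    def draw_rule(self, x, y, width, height):
        # A real backend would fill a rectangle (e.g. a fraction bar)
        self.rules.append((x, y, width, height))


def draw_fraction(backend, numerator, denominator, size=10.0):
    """Emit drawing commands for a crude stacked fraction; returns its width.

    Assumes every glyph is 0.5 * size wide, which no real font guarantees.
    """
    advance = 0.5 * size
    width = advance * max(len(numerator), len(denominator))
    # Numerator sits above the bar, denominator below it (assumed offsets)
    for text, baseline in ((numerator, 0.35 * size), (denominator, -0.85 * size)):
        x = (width - advance * len(text)) / 2  # centre each row
        for char in text:
            backend.draw_glyph(char, x, baseline, size)
            x += advance
    backend.draw_rule(0.0, 0.25 * size, width, 0.05 * size)
    return width


backend = RecordingBackend()
draw_fraction(backend, "a", "bc")
for glyph in backend.glyphs:
    print("glyph", glyph)
print("rule ", backend.rules[0])
```

A rinoh backend would replace the two recording methods with calls into its own page-drawing code; the layout logic producing the coordinates is the part matplotlib's mathtext would supply.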

I would propose to fork from matplotlib, drop everything except mathtext and then add rinoh/Sphinx integration.

This could be done in several steps:

  1. Add back support for mathtext from matplotlib - include regression tests for future changes
  2. Fork matplotlib and strip out everything that is not important to rinohtype
  3. Implement a rinoh backend for the matplotlib fork
  4. (optionally) bring the fork into mainline rinoh

I imagine that I am oversimplifying some rather large pieces of work, but I think this outlines a plan that is at least feasible to achieve in a finite amount of time. @brechtm what do you think?

I would be willing to have a stab at this (or collaborate with @hamzamohdzubair if he is still interested in contributing) if you think it is a valuable addition - but obviously, I make no promises of success ;)

brechtm commented 3 years ago

I can't remember exactly how incomplete the Matplotlib math rendering is. I think it could only handle relatively simple things. But that could still be useful, of course.

I had a quick look and I saw that I just included the matplotlib files you list (except for _mathtext.py, maybe it was split up since then?) in the rinohtype codebase. I removed all of it in f82ed7fc812205df2dea24394fbf6acc7c904872. That would probably be a good place to start. I can imagine quite a lot has changed since then, however. The history of the math.py and mathtext.py files goes back to 2011 (back when the module was still named pyte).

I think it's fine to include the required matplotlib files in the rinoh package for now, to keep things simple. We should add a message at the top of those files and the README (License section) referencing the original project (like hyphenator.py and purepng.py). Later we can see whether the matplotlib team is interested in splitting it off into a separate package.

It's difficult to say how much effort this will take, so the time might be better spent on starting a new implementation from scratch, making use of the OpenType math support included in specific fonts.

hamzamohdzubair commented 3 years ago

I was looking at OpenType. I found this: fonttools. fonttools has a tool called TTX that can convert OpenType fonts to and from an XML text format. Does this seem like the right direction?

brechtm commented 3 years ago

I was looking at OpenType. I found this: fonttools. fonttools has a tool called TTX that can convert OpenType fonts to and from an XML text format. Does this seem like the right direction?

I'm afraid not. rinohtype has its own code for loading OpenType fonts. Loading the MATH table should be relatively simple (see the code for other tables). But of course, the meat of the implementation will be in the placement of the glyphs.
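
As a rough illustration of why the loading part is the easy bit, here is a standalone sketch that locates the MATH table in an OpenType file and reads its header. It uses only struct from the standard library and does not use rinohtype's own font-loading code (which is not shown in this thread); the function names are made up for this example, and font collections (TTC) and WOFF wrappers are not handled.

```python
import struct


def find_table(font_data, tag):
    """Return (offset, length) of table `tag` from the sfnt table directory, or None."""
    # Offset table: sfntVersion (uint32), numTables (uint16), then 3 more uint16
    (num_tables,) = struct.unpack_from(">H", font_data, 4)
    for i in range(num_tables):
        # Each table record: tag (4 bytes), checksum, offset, length (uint32 each)
        record_tag, _checksum, offset, length = struct.unpack_from(
            ">4sLLL", font_data, 12 + 16 * i
        )
        if record_tag == tag:
            return offset, length
    return None


def read_math_header(font_data):
    """Read the MATH table header; sub-table offsets are relative to the table start."""
    entry = find_table(font_data, b"MATH")
    if entry is None:
        return None  # the font has no maths support
    offset, _length = entry
    major, minor, constants, glyph_info, variants = struct.unpack_from(
        ">HHHHH", font_data, offset
    )
    return {
        "version": (major, minor),
        "math_constants_offset": constants,
        "math_glyph_info_offset": glyph_info,
        "math_variants_offset": variants,
    }
```

Calling read_math_header(open(path, "rb").read()) on a font with maths support should return the three sub-table offsets. The MathConstants, MathGlyphInfo and MathVariants sub-tables they point to hold the data that the glyph placement logic would actually consume.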

cheekyshibe commented 1 year ago

I'm using MathJax and myst-parser to write math content in my Sphinx project. It would be nice to be able to include these mathematical formulas in the exported PDF file using rinohtype. This is indeed a difficult problem to solve.