jure / mathtype_to_mathml

Converts equations from MathType format (MTEF) to MathML.
MIT License

Use mathml namespace #3

Closed egh closed 9 years ago

egh commented 9 years ago

MathML needs to have its XML namespace set to be properly formatted. This change forces Nokogiri to export MathML and only MathML.

Also, this stops the specs from automatically updating the expected XML, so that the tests can actually fail.
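
As a rough illustration of the namespace point, here is a minimal sketch that declares the MathML default namespace on a root math element. It uses Ruby's bundled REXML library rather than Nokogiri so that it runs without extra gems, and with_mathml_namespace is a hypothetical helper name, not something taken from this pull request.

```ruby
require "rexml/document"

# The MathML namespace URI defined by the W3C.
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical helper: parse an XML string and declare the MathML
# default namespace on its root element if none is declared yet.
def with_mathml_namespace(xml)
  doc = REXML::Document.new(xml)
  root = doc.root
  root.add_namespace(MATHML_NS) if root.namespace.to_s.empty?
  doc.to_s
end

# A bare fragment, as a converter might emit it without a namespace.
bare = "<math><mi>x</mi></math>"
namespaced = with_mathml_namespace(bare)
puts namespaced
```

The helper leaves input alone when the root already declares a default namespace, so applying it twice should not add a second declaration.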

jure commented 9 years ago

Thanks for your contribution, Erik! Where did you have this issue of incorrect formatting?

I'll check this out tonight, it somehow flew right under my radar.

egh commented 9 years ago

I'm trying to improve the process of translation into HTML via OxGarage by embedding MathML in the docx XML. (It makes sense, I promise. :)

In any case, MathML is not MathML unless it has a namespace or is embedded in HTML5.

jure commented 9 years ago

Ha, I'm certain it makes sense, even if it sounds a bit strange. :) Is there anything you can push upstream to OxGarage?

As for the namespace, that makes sense: my test case was HTML5, so I didn't consider this.

egh commented 9 years ago

Basically I am using the technique here:

https://msdn.microsoft.com/en-us/library/documentformat.openxml.wordprocessing.contentpart%28v=office.14%29.aspx

to embed MathML in the docx file. This way we don't have to post-process the document based on errors reported by OxGarage. (That technique was not working for me.)

These are the changes:

https://github.com/TEIC/Stylesheets/pull/117

jure commented 9 years ago

You've overwritten the examples (collected manually using MathType's native conversion) with actual outputs. That happens just by running my code, so it's not your fault :), but I'll have to fix that. I didn't run the specs this way, as the XMLs never matched and may never match exactly. The visual comparison you get is very useful, though, and that's the testing method I used:

bundle exec ruby spec/html_output.rb > test.html
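
For readers unfamiliar with this style of testing, the following is a minimal sketch of a visual comparison page: each reference MathML fragment is placed next to the converter's output so a person can compare how the two render in a browser. It is not the contents of spec/html_output.rb; comparison_page and the sample data are hypothetical.

```ruby
# Hypothetical sketch: build an HTML5 page that shows expected and
# actual MathML side by side for visual comparison in a browser.
def comparison_page(cases)
  rows = cases.map do |name, expected, actual|
    "<tr><th>#{name}</th><td>#{expected}</td><td>#{actual}</td></tr>"
  end
  <<~HTML
    <!DOCTYPE html>
    <html>
    <head><meta charset="utf-8"><title>MathML comparison</title></head>
    <body>
    <table border="1">
    <tr><th>Equation</th><th>Expected</th><th>Actual</th></tr>
    #{rows.join("\n")}
    </table>
    </body>
    </html>
  HTML
end

# Sample data, made up for illustration: name, reference, converter output.
cases = [
  ["equation1", "<math><mi>x</mi></math>", "<math><mi>x</mi></math>"],
  ["equation2", "<math><mn>2</mn></math>", "<math><mi>2</mi></math>"]
]

page = comparison_page(cases)
puts page
```

Redirecting the output to a file, as in the command above, gives a page that can be opened in a browser. Because the page is HTML5, the embedded math elements do not need an explicit namespace there.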

We have three equations with issues (equation2.bin, equation4.bin, equation14.bin), so I'll look at those as well.

jure commented 9 years ago

Committed here https://github.com/jure/mathtype_to_mathml/commit/936934927c0032fa7a66a97517891f003325a36e

I left the namespace addition in, and I'll stop the RSpec tests from overwriting the expected XMLs, but I removed all other modifications to the expected XML files.

Hope that addresses your issue!

jure commented 9 years ago

And this is the XML equivalence change: https://github.com/jure/mathtype_to_mathml/commit/db2030b16be7e429340b1b53eb615471441ed3d4

This is still not going to work right because of subtle differences between MathType's conversion and our own. Once we have a completely working test set (verified by visual comparison, and including #1), we can switch to comparing the XMLs directly, but for now I'd use html_output.rb.
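
To illustrate what comparing the XMLs directly might involve, here is a minimal sketch of a structural comparison that ignores whitespace-only text nodes and attribute order. It uses REXML and a hypothetical equivalent_xml? helper; it is not the code from the commit linked above.

```ruby
require "rexml/document"

# Reduce a node to a plain Ruby structure: elements become
# [name, sorted attributes, children], text becomes a stripped string.
# Whitespace-only text nodes are dropped; comments are ignored.
def canonical(node)
  case node
  when REXML::Element
    attrs = []
    node.attributes.each_attribute do |attr|
      attrs << [attr.expanded_name, attr.value]
    end
    children = node.children.map { |child| canonical(child) }.compact
    [node.expanded_name, attrs.sort, children]
  when REXML::Text
    text = node.value.strip
    text.empty? ? nil : text
  end
end

# Hypothetical helper: true when both documents have the same structure.
def equivalent_xml?(a, b)
  canonical(REXML::Document.new(a).root) == canonical(REXML::Document.new(b).root)
end

expected  = "<math><mrow><mi>x</mi></mrow></math>"
indented  = "<math>\n  <mrow>\n    <mi>x</mi>\n  </mrow>\n</math>"
different = "<math><mrow><mn>x</mn></mrow></math>"

puts equivalent_xml?(expected, indented)
puts equivalent_xml?(expected, different)
```

A check like this only smooths over formatting differences. Where the converter and MathType choose different elements for the same equation, as in the last case, it would still report a mismatch, which is why the visual comparison remains useful in the meantime.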

egh commented 9 years ago

Thanks, @jure! I think I misunderstood what your tests were doing. It sounds like you are checking against reference XML output from MathType.

Maybe these tests could be marked pending so that the test suite passes.