ml4ai / skema

SKEMA: Scientific Knowledge Extraction and Model Analysis
https://ml4ai.github.io/skema/

Relax `/mml/amr` constraints to handle pMML output from `/image/mml` #289

Closed: myedibleenso closed this issue 1 year ago

myedibleenso commented 1 year ago

The current version of the `/mml/amr` endpoint cannot handle the following pMML, which the `/image/mml` endpoint produced from images of the Lotka-Volterra equations:

{
  "mathml": [
    "<math> <mfrac> <mrow> <mi> d </mi> <mi> a </mi> <mi> s </mi> <mrow> <mi> &#x03D1 </mi> </mrow> </mrow> <mrow> <mi> d </mi> <mi> a </mi> <mi> s </mi> <mrow> <mi> &#x03D1 </mi> </mrow> </mrow> </mfrac> <mo> = </mo> <mi> a </mi> <mi> s </mi> <mrow> <mi> &#x03B1 </mi> </mrow> <mi> a </mi> <mi> s </mi> <mrow> <mi> &#x03D1 </mi> </mrow> <mo> &#x2212 </mo> <mi> a </mi> <mi> s </mi> <mrow> <mi> &#x03B2 </mi> </mrow> <mi> a </mi> <mi> s </mi> <mrow> <mi> &#x03D1 </mi> </mrow> <mo> = </mo> </math>", 
    "<math> <mfrac> <mrow> <mi> d </mi> <mi> a </mi> <mi> s </mi> <mrow> <mi> &#x03D1 </mi> </mrow> </mrow> <mrow> <mi> d </mi> <mi> a </mi> <mi> s </mi> <mrow> <mi> &#x03D1 </mi> </mrow> </mrow> </mfrac> <mo> = </mo> <mi> a </mi> <mi> s </mi> <mrow> <mi> &#x03B1 </mi> </mrow> <mi> a </mi> <mi> s </mi> <mrow> <mi> &#x03D1 </mi> </mrow> <mo> &#x2212 </mo> <mi> a </mi> <mi> s </mi> <mrow> <mi> &#x03B2 </mi> </mrow> <mi> a </mi> <mi> s </mi> <mrow> <mi> &#x03D1 </mi> </mrow> <mo> = </mo> </math>"
  ], 
  "model": "regnet"
}

A self-contained (non-working) example is provided at the end of this notebook: https://github.com/ml4ai/ASKEM-TA1-DockerVM/blob/c2404523e7bb0a0af7f773ac95881bb8f1700da7/equations-rest/notebooks/eqn2amr.ipynb

adarshp commented 1 year ago

Can we update the img2mml pipeline to not have extra whitespace within an element? That is, instead of producing <mo> + </mo>, it would be better to produce <mo>+</mo>. Otherwise the parsing becomes more involved, for little benefit.

adarshp commented 1 year ago

Actually, on further thought, I think we should update the MathML parser to handle the whitespace, since it seems like it will be rendered properly on the web.
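For illustration, here is a minimal Python sketch of the whitespace tolerance being discussed. It is not the SKEMA parser's actual implementation; the function name `normalize_token_whitespace` and the `TOKEN_TAGS` set are hypothetical, and it assumes the input is well-formed XML (which the pMML payload above may not be). It trims padding inside token elements such as `<mo> + </mo>` and drops whitespace-only text between elements:

```python
import xml.etree.ElementTree as ET

# MathML token elements, whose text content is a single token.
# (Illustrative subset; an assumption, not taken from the SKEMA parser.)
TOKEN_TAGS = {"mi", "mo", "mn", "mtext", "ms"}


def normalize_token_whitespace(mathml: str) -> str:
    """Return `mathml` with padding whitespace removed.

    Hypothetical helper: trims leading/trailing whitespace inside token
    elements and removes whitespace-only text between elements.
    """
    root = ET.fromstring(mathml)
    for elem in root.iter():
        if elem.tag in TOKEN_TAGS:
            # "<mo> + </mo>" becomes "<mo>+</mo>"
            if elem.text is not None:
                elem.text = elem.text.strip()
        elif elem.text is not None and not elem.text.strip():
            # Whitespace-only text before the first child of a container
            elem.text = None
        if elem.tail is not None and not elem.tail.strip():
            # Whitespace-only text following this element
            elem.tail = None
    return ET.tostring(root, encoding="unicode")


padded = "<math> <mi> x </mi> <mo> + </mo> <mi> y </mi> </math>"
print(normalize_token_whitespace(padded))
```

Normalizing on the parser side like this means the endpoint accepts both the padded and unpadded forms, so the img2mml output format would not need to change.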

myedibleenso commented 1 year ago

> Actually, on further thought, I think we should update the MathML parser to handle the whitespace, since it seems like it will be rendered properly on the web.

This may impact plans for https://github.com/ml4ai/skema/issues/291

Are your changes also going to address tolerance for comments? If so, I think I can close #291.

adarshp commented 1 year ago

@myedibleenso - I don't plan to address comments with these changes, since the img2mml pipeline doesn't produce them.

myedibleenso commented 1 year ago

Closed by https://github.com/ml4ai/skema/pull/290. I think this puts the ball back in img2mml's court. 🤞 that the new models address some of the issues we're seeing!