ml4ai / skema

SKEMA: Scientific Knowledge Extraction and Model Analysis

Relax `/mml/amr` constraints to handle pMML output from `/image/mml` #289

Closed myedibleenso closed 1 year ago

myedibleenso commented 1 year ago

The current version of the /mml/amr endpoint is unable to handle the following pMML produced from Lotka-Volterra equation images by the /image/mml endpoint:

  "mathml": [
    "<math> <mfrac> <mrow> <mi> d </mi> <mi> a </mi> <mi> s </mi> <mrow> <mi> &#x03D1 </mi> </mrow> </mrow> <mrow> <mi> d </mi> <mi> a </mi> <mi> s </mi> <mrow> <mi> &#x03D1 </mi> </mrow> </mrow> </mfrac> <mo> = </mo> <mi> a </mi> <mi> s </mi> <mrow> <mi> &#x03B1 </mi> </mrow> <mi> a </mi> <mi> s </mi> <mrow> <mi> &#x03D1 </mi> </mrow> <mo> &#x2212 </mo> <mi> a </mi> <mi> s </mi> <mrow> <mi> &#x03B2 </mi> </mrow> <mi> a </mi> <mi> s </mi> <mrow> <mi> &#x03D1 </mi> </mrow> <mo> = </mo> </math>", 
    "<math> <mfrac> <mrow> <mi> d </mi> <mi> a </mi> <mi> s </mi> <mrow> <mi> &#x03D1 </mi> </mrow> </mrow> <mrow> <mi> d </mi> <mi> a </mi> <mi> s </mi> <mrow> <mi> &#x03D1 </mi> </mrow> </mrow> </mfrac> <mo> = </mo> <mi> a </mi> <mi> s </mi> <mrow> <mi> &#x03B1 </mi> </mrow> <mi> a </mi> <mi> s </mi> <mrow> <mi> &#x03D1 </mi> </mrow> <mo> &#x2212 </mo> <mi> a </mi> <mi> s </mi> <mrow> <mi> &#x03B2 </mi> </mrow> <mi> a </mi> <mi> s </mi> <mrow> <mi> &#x03D1 </mi> </mrow> <mo> = </mo> </math>"
  ],
  "model": "regnet"
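
As rendered in this thread, the character references in the payload (for example `&#x03D1`) have no terminating semicolon; that may be an artifact of how the payload was pasted. The following is a minimal sketch, not part of the SKEMA codebase, of how one might check whether a string in this style is accepted by a strict XML parser. The `parses` helper and the two sample strings are hypothetical and only illustrate the check.

```python
import xml.etree.ElementTree as ET

# Shortened fragments in the style of the payload above.
# Assumption: the character references really lack the terminating ";".
loose = "<math> <mi> &#x03D1 </mi> </math>"
strict = "<math><mi>&#x03D1;</mi></math>"


def parses(mathml: str) -> bool:
    """Return True if a strict XML parser accepts the string."""
    try:
        ET.fromstring(mathml)
        return True
    except ET.ParseError:
        return False


print(parses(loose))   # unterminated character reference
print(parses(strict))  # terminated reference, no padding
```

If the first check fails while the second succeeds, any consumer built on a strict XML parser would need the references repaired before it could read the payload, independently of the whitespace question discussed below.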

A self-contained (non-working) example is provided at the end of this notebook:

adarshp commented 1 year ago

Can we update the img2mml pipeline to not have extra whitespace within an element? That is, instead of producing <mo> + </mo>, it would be better to produce <mo>+</mo>. Otherwise the parsing becomes more involved, for little benefit.

adarshp commented 1 year ago

Actually, on further thought, I think we should update the MathML parser to handle the whitespace, since it seems like it will be rendered properly on the web.
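
For illustration, whitespace tolerance could also be approximated with a preprocessing pass in front of a stricter parser. This is a rough sketch under stated assumptions, not the change that was made to the SKEMA MathML parser: the `normalize_pmml` function, the `TOKEN_TAGS` set, and the regex-based approach are all hypothetical.

```python
import re

# Token elements whose text content gets trimmed (an assumed set).
TOKEN_TAGS = ("mi", "mo", "mn", "mtext")

# Matches e.g. "<mo> + </mo>" and captures the tag, attributes, and trimmed body.
_TOKEN_RE = re.compile(
    r"<(?P<tag>" + "|".join(TOKEN_TAGS) + r")(?P<attrs>\b[^>]*)>"
    r"\s*(?P<body>.*?)\s*</(?P=tag)>",
    re.DOTALL,
)


def normalize_pmml(mathml: str) -> str:
    """Trim padding inside token elements and drop whitespace between tags."""
    trimmed = _TOKEN_RE.sub(
        lambda m: f"<{m['tag']}{m['attrs']}>{m['body']}</{m['tag']}>",
        mathml,
    )
    # Remove whitespace-only runs between adjacent tags.
    return re.sub(r">\s+<", "><", trimmed)


sample = "<math> <mi> d </mi> <mo> + </mo> </math>"
print(normalize_pmml(sample))
```

A pass like this leaves the element structure untouched and only rewrites text nodes, so `<mo> + </mo>` becomes `<mo>+</mo>` while container elements such as `<mrow>` and `<mfrac>` are unaffected apart from the whitespace between their children.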

myedibleenso commented 1 year ago

> Actually, on further thought, I think we should update the MathML parser to handle the whitespace, since it seems like it will be rendered properly on the web.

This may impact plans for

Are your changes also going to address tolerance for comments? If so, I think I can close #291.

adarshp commented 1 year ago

@myedibleenso - I don't plan to address comments with these changes, since the img2mml pipeline doesn't produce them.

myedibleenso commented 1 year ago

Closed by . I think this puts the ball back into img2mml's court. 🤞 that the new models address some of the issues we're seeing!