glossarist / iev-data

Some `stem:[]` blocks get exported as `dstem:[]` #116

Open ronaldtse opened 3 years ago

ronaldtse commented 3 years ago

Here's a source sample:

algebraic equation giving the number d<i>N</i> of particles of a non-quantized system, the components of velocity of which are comprised in the intervals (<i>u</i>, <i>u</i> + d<i>u</i>), (<i>v</i>, <i>v</i> + d<i>v</i>), (<i>w</i>, <i>w</i> + d<i>w</i>) respectively: <p>
<math>
<semantics>
<mrow>
<mstyle displaystyle='true'>
<mtext>d</mtext><mi>N</mi><mo>=</mo><mi>A</mi><mo>⋅</mo><mtext>exp</mtext><mrow><mo>&lsqb;</mo><mrow>
<mfrac>
<mrow>
<mo>−</mo><mi>m</mi><mrow><mo>(</mo>
<mrow>
<msup>
<mi>u</mi>
<mn>2</mn>
</msup>
<mo>+</mo><msup>
<mi>v</mi>
<mn>2</mn>
</msup>
<mo>+</mo><msup>
<mi>w</mi>
<mn>2</mn>
</msup>
</mrow>
<mo>)</mo></mrow>
</mrow>
<mrow>
<mn>2</mn><mi>k</mi><mi>T</mi>
</mrow>
</mfrac>
</mrow><mo>&rsqb;</mo></mrow><mtext>d</mtext><mtext> </mtext><mi>u</mi><mo>⋅</mo><mtext>d</mtext><mtext> </mtext><mi>v</mi><mo>⋅</mo><mtext>d</mtext><mtext> </mtext><mi>w</mi>
</mstyle>
</mrow>
<annotation encoding='MathType-MTEF'>MathType@MTEF@5@5@+=feaagCart1ev2aqatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbbjxAHXgarqqtubsr4rNCHbGeaGqipG0dh9qqWrVepG0dbbL8F4rqqrVepeea0xe9LqFf0xc9q8qqaqFn0lXdHiVcFbIOFHK8Feea0dXdar=Jb9hs0dXdHuk9fr=xfr=xfrpeWZqaaeqabiGaciGacaqadmaadaqaaqaaaOqaaiaabsgacaWGobGaeyypa0JaamyqaiabgwSixlaabwgacaqG4bGaaeiCamaadmaabaWaaSaaaeaacqGHsislcaWGTbWaaeWaaeaacaWG1bWaaWbaaSqabeaacaaIYaaaaOGaey4kaSIaamODamaaCaaaleqabaGaaGOmaaaakiabgUcaRiaadEhadaahaaWcbeqaaiaaikdaaaaakiaawIcacaGLPaaaaeaacaaIYaGaam4AaiaadsfaaaaacaGLBbGaayzxaaGaaeizaiaaykW7caWG1bGaeyyXICTaaeizaiaaykW7caWG2bGaeyyXICTaaeizaiaaykW7caWG3baaaa@5C4B@</annotation>
</semantics>
</math><p> 
where<p>
<math>
<semantics>
<mrow><mstyle displaystyle='true'>
<mi>A</mi><mo>=</mo><mi>N</mi><msup>
<mrow>
<mrow><mo>&lsqb;</mo><mrow>
<mfrac>
<mi>m</mi>
<mrow>
<mrow><mo>(</mo>
<mrow>
<mn>2</mn><mi>π</mi><mo>⋅</mo><mi>k</mi><mi>T</mi>
</mrow>
<mo>)</mo></mrow>
</mrow>
</mfrac>
</mrow><mo>&rsqb;</mo></mrow>
</mrow>
<mrow>
<mrow><mn>3</mn><mo>/</mo><mn>2</mn></mrow>
</mrow>
</msup>
</mstyle></mrow>
<annotation encoding='MathType-MTEF'>MathType@MTEF@5@5@+=feaagCart1ev2aqatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbbjxAHXgarqqtubsr4rNCHbGeaGqipG0dh9qqWrVepG0dbbL8F4rqqrVepeea0xe9LqFf0xc9q8qqaqFn0lXdHiVcFbIOFHK8Feea0dXdar=Jb9hs0dXdHuk9fr=xfr=xfrpeWZqaaeqabiGaciGacaqadmaadaqaaqaaaOqaaiaadgeacqGH9aqpcaWGobWaamWaaeaadaWcaaqaaiaad2gaaeaadaqadaqaaiaaikdacqaHapaCcqGHflY1caWGRbGaamivaaGaayjkaiaawMcaaaaaaiaawUfacaGLDbaadaahaaWcbeqaamaalyaabaGaaG4maaqaaiaaikdaaaaaaaaa@44B1@</annotation>
</semantics>
</math><p> 
<i>N</i> is the total number of particles<br>
<i>m</i> is the mass of a particle<br>
<i>T</i> is the thermodynamic temperature<br>
<i>k</i> is the Boltzmann constant
<NOTE – d<i>N</i>/<i>N</i> represents the probability that a particle has its components of velocity within the intervals considered.

Notice that `d<i>N</i>/<i>N</i>` becomes `dstem:[N]/stem:[N]`. When we capture a block like `d<i>N</i>/<i>N</i>`, the text glued to the front of the italic element has to be handled as well: so far, every `{\w}stem:[...]` really means `stem:[{\w} ...]`.
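
To make the `{\w}stem:[...]` observation concrete, here is a minimal Python sketch of a post-processing step that pulls glued word characters into the following stem macro. It is an illustration only, not the project's actual code: the function name `merge_glued_prefix` is hypothetical, the prefix is concatenated without a separating space, and the pattern assumes the stem body contains no nested `]`.

```python
import re

# Word characters glued directly to the front of a stem macro, e.g. "dstem:[N]".
# Assumption: the stem body never contains a nested "]".
GLUED_STEM = re.compile(r"\b(\w+)stem:\[([^\]]*)\]")


def merge_glued_prefix(text):
    """Rewrite `{prefix}stem:[body]` as `stem:[{prefix}{body}]`."""
    return GLUED_STEM.sub(lambda m: f"stem:[{m.group(1)}{m.group(2)}]", text)


print(merge_glued_prefix("dstem:[N]/stem:[N] represents the probability"))
# → stem:[dN]/stem:[N] represents the probability
```

A properly separated `stem:[N]` is left alone, because the pattern requires at least one word character immediately before `stem:[`.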

ronaldtse commented 3 years ago

Here's the parsed result:

Original:

<NOTE – d<i>N</i>/<i>N</i> represents the probability that a particle has its components of velocity within the intervals considered.

iev/subregisters/eng/definition/1a75b93e-ad96-5be1-9a95-fcf9712b406e.yaml:

dstem:[N]/stem:[N] represents the probability that a particle has its components...

ronaldtse commented 3 years ago

@skalee is there something you can do here? Thanks!

skalee commented 3 years ago

@ronaldtse How about placing a zero-width space just before `stem`? That said, `d<i>N</i>/<i>N</i>` looks buggy in general; perhaps the parser should print a warning when it encounters something like that?

skalee commented 3 years ago

Nah, I think my previous comment was stupid. That preceding "d" is part of the mathematical formula and should be in stem, correct? It should rather be `stem:[dN]/stem:[N]`, or maybe even `stem:[dN/N]`, correct?

For reference, we're talking about concept 521-01-05 here, but plenty of others are affected too. Full list (including false positives):

```
102-01-08 102-01-09 102-02-13 102-03-04 102-03-22 102-04-40 102-05-01 102-05-06 102-05-07 102-05-10 102-05-19 102-05-20 102-05-22 102-05-28 102-05-30 102-05-31 102-05-32 102-05-33 102-06-12 102-06-13 103-02-04 103-02-05 103-04-01 103-04-05 103-04-09 103-08-10 103-09-06 103-09-07 103-10-18 112-01-01 112-01-03 112-01-20 112-02-04 112-02-05 112-02-06 112-02-07 112-02-08 112-02-10 113-01-32 113-01-38 113-01-41 113-01-46 113-03-07 113-03-10 113-03-11 113-03-13 113-03-14 113-03-18 113-03-21 113-03-22 113-03-29 113-03-30 113-03-40 113-03-42 113-03-44 113-03-58 113-03-59 113-03-60 113-03-61 113-03-62 113-03-63 113-03-64 113-03-71 113-03-72 113-03-75 113-04-20 113-04-37 113-04-39 113-04-42 113-04-47 113-04-49 113-04-50 121-11-02 121-11-11 121-11-12 121-11-13 121-11-17 121-11-21 121-11-22 121-11-24 121-11-26 121-11-27 121-11-41 121-11-43 121-11-45 121-11-51 121-11-57 121-12-14 121-12-30 121-13-26 131-12-32 131-12-36 131-12-40 131-12-43 131-12-67 141-01-03 161-04-19 192-05-06 192-05-08 192-07-20 192-13-03 221-01-20 221-02-36 221-02-37 221-02-63 221-02-64 221-03-08 221-03-40 221-04-06 221-04-07 351-45-19 351-45-26 351-45-27 351-45-28 351-45-29 351-45-30 351-45-31 351-45-51 351-46-07 351-50-02 351-50-03 351-50-05 351-50-08 351-50-11 351-50-12 351-50-13 351-50-16 351-50-17 351-50-18 351-50-19 351-50-31 351-50-32 351-50-33 351-50-34 351-50-35 351-50-36 351-50-37 351-50-38 351-50-39 395-01-05 395-01-11 395-01-15 395-01-16 395-01-17 395-01-18 395-01-24 395-01-26 395-01-27 395-01-30 395-01-32 395-01-34 395-01-35 395-01-36 395-01-42 395-03-43 521-01-05 521-01-19 521-05-14 521-07-21 521-07-22 523-03-07 523-08-09 561-01-11 561-01-45 561-03-21 603-04-08 702-02-20 702-03-13 702-04-53 702-04-54 702-04-55 702-08-52 705-02-14 705-02-17 705-02-21 705-02-24 705-05-41 712-02-02 712-04-16 712-04-17 713-07-13 726-05-01 726-05-02 726-19-01 726-19-02 731-01-17 731-01-18 731-03-76 731-03-78 731-06-42 801-25-03 815-10-14 815-10-32 815-10-39 815-10-41 815-11-08 815-11-09 815-11-20 815-13-59
815-17-03 821-12-21 845-21-037 845-21-041 845-21-043 845-21-047 845-21-050 845-21-067 845-21-068 845-21-070 845-21-074 845-21-075 845-21-076 845-21-077 845-21-078 845-21-079 845-21-101 845-22-037 845-22-085 845-22-088 845-22-090 845-23-073 845-23-074 845-23-076 845-23-078 845-23-080 845-24-063 845-25-063 845-25-073 845-25-088 845-29-092 845-29-151 845-29-152 845-29-153 845-29-155 845-29-156 845-29-157 881-03-03 881-04-18 881-04-19 881-04-22 881-04-23 881-04-25 881-04-26 881-04-27 881-04-32 881-04-42 881-04-54 881-12-06 881-12-07 881-12-20 881-12-27 881-12-28 881-12-29 881-12-30 881-12-43 881-12-47 881-12-48 881-12-49 881-12-50 881-12-51 881-12-52 881-12-53 881-14-02
```

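
A list like the one above could be produced by a scan along the following lines. This is only a guess at the approach, not the command that was actually run: the function name `find_suspects` and the shape of the input (a dict mapping concept ids to already-exported definition text) are assumptions, and the second sample entry is made up.

```python
import re

# Any word character sitting directly in front of "stem:[" is suspicious.
SUSPECT = re.compile(r"\w+stem:\[")


def find_suspects(definitions):
    """Return the sorted concept ids whose text has characters glued to `stem:[`."""
    return sorted(cid for cid, text in definitions.items() if SUSPECT.search(text))


sample = {
    # Text taken from the exported YAML quoted earlier in this issue.
    "521-01-05": "dstem:[N]/stem:[N] represents the probability that a particle has its components...",
    # Hypothetical clean entry, for contrast only.
    "000-00-00": "stem:[N] is the total number of particles",
}
print(find_suspects(sample))
# → ['521-01-05']
```

A purely textual check like this cannot tell a formula prefix such as `d` from ordinary prose that merely lost its trailing space, which is one way false positives would end up in the list.
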
ronaldtse commented 3 years ago

> That preceding "d" is part of the mathematical formula and should be in stem, correct? It should rather be `stem:[dN]/stem:[N]`, or maybe even `stem:[dN/N]`, correct?

Indeed. The latter is the most correct, certainly.
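
One way to arrive at the `stem:[dN/N]` form is to treat a whitespace-free run of italic elements, glued letters and operators as a single formula at conversion time. The Python sketch below illustrates that idea under stated assumptions; it is not the project's parser. The name `italics_to_stem`, the operator set `/ + - * =`, and the rule that any whitespace ends a run are all choices made for this example.

```python
import re

# A formula run: optional glued letters, an <i> element, then any number of
# "operator, optional glued letters, <i> element" continuations with no
# whitespace in between. The operator set is an assumption for this sketch.
RUN = re.compile(r"(?<!\w)\w*<i>[^<]*</i>(?:[/+\-*=]\w*<i>[^<]*</i>)*")


def italics_to_stem(html):
    """Wrap each whitespace-free run of italic markup in a single stem macro."""
    def convert(match):
        body = re.sub(r"</?i>", "", match.group(0))  # drop the <i> tags
        return f"stem:[{body}]"
    return RUN.sub(convert, html)


print(italics_to_stem("d<i>N</i>/<i>N</i> represents the probability"))
# → stem:[dN/N] represents the probability
```

Because whitespace ends a run, a spaced expression such as `<i>u</i> + d<i>u</i>` still yields two macros, `stem:[u] + stem:[du]`; merging across spaces would need a separate decision.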