glossarist / iev-data


Concepts may contain HTML (case in point: 845-23-076 in English) #117

Open strogonoff opened 3 years ago

strogonoff commented 3 years ago

(cc @skalee and @ronaldtse)

ronaldtse commented 3 years ago

In fact, we are letting a lot of HTML go through. For example, in this file: concept-845-23-076.yaml.zip

term: CIE 1976 L^*^a^*^b^*^ colour space
eng:
  id: 845-23-076
...
  definition: "three-dimensional, approximately uniform colour space produced by plotting
    in rectangular coordinates stem:[L]*, stem:[a]*, stem:[b]*, quantities defined
    by the equations:<ul style=\"list-style-type:none;\">* stem:[[[L,*,=,116,f,( Y
...
  notes:
  - "Approximate correlates of lightness, chroma, and hue can be calculated as follows:\n\nCIE
    1976 lightness:<ul style=\"list-style-type:none;\">* stem:[L^( ** ) = 116 f (
    Y // Y_n ) - 16]\n\nwhere<ul style=\"list-style-type:none;\">\n* stem:[f ( Y //
...

We need to fully parse the HTML.
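As a rough illustration of how leftover markup like this could be detected, here is a minimal Python sketch using only the standard library's `html.parser`. It is not code from iev-data; the names `TagCollector` and `leftover_html_tags` are hypothetical, and the sample string is copied from the note excerpt quoted above.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collects the names of any HTML start tags seen in the input."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # Called once per opening tag, e.g. "ul" for <ul style="...">.
        self.tags.append(tag)


def leftover_html_tags(text):
    """Return the HTML start tag names found in `text`, in order.

    Hypothetical helper for illustration; not part of iev-data.
    """
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return collector.tags


# Excerpt from the note quoted above (the original is truncated as well).
sample = (
    'CIE 1976 lightness:<ul style="list-style-type:none;">'
    "* stem:[L^( ** ) = 116 f ( Y // Y_n ) - 16]"
)
print(leftover_html_tags(sample))
```

A check like this only flags values that still contain tags; it does not convert them. It may also misread math that contains a literal `<`, so the result should be treated as a hint rather than proof.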

skalee commented 3 years ago

Just to be sure we're on the same page:

Right?

ronaldtse commented 3 years ago

@skalee Correct. As for the term: value, L^*^a^*^b^*^ is probably just not being parsed as math, because in the definition: the same text is represented as stem:[L]*, stem:[a]*, stem:[b]* (which is still not ideal).

Thanks!