geolexica / isotc211.geolexica.org

ISO/TC 211 online version of the Multi-Lingual Glossary of Terms
https://isotc211.geolexica.org

erroneous characters "<>" appearing in all definitions across all languages #200

Closed ReesePlews closed 11 months ago

ReesePlews commented 11 months ago

hello, i am just noticing some erroneous characters "<>" appearing in all definitions across all languages in tc211 geolexica. i did not notice them earlier this week or last week.

can you check this at your earliest convenience. thank you.

[Screenshot: "abbreviation" entry, MLGT - The authoritative multi-lingual geographic information terminology database, 2023-10-17 20:26:11]

ronaldtse commented 11 months ago

@HassanAkbar can you please help fix this ASAP? I think this is the empty "domain" being shown.

ronaldtse commented 11 months ago

Thank you @ReesePlews for spotting this!

ReesePlews commented 11 months ago

Thank you @ronaldtse no problem. sorry i did not notice before. when i check the other pages, i note that the domains (< >) are shown after the concept/term. https://isotc211.geolexica.org/concepts/43/

image

however, under regular ISO terminological styling rules, and in the OBP, the domain is shown at the start of the definition. i wonder if it was shifted in geolex for some reason?

image

perhaps the issue is two-fold. when a domain is present, it should be shown at the start of the definition (not at the end of the concept string), but when there is no domain, then of course those <>s should not be there.

additionally, for processing into the excel repo i will shift the position of the domain to appear after the concept. however if we follow ISO terminological styling (which geolex does) it should be at the front of the definition.

Ron, does this follow your understanding also?

@HassanAkbar let me know if you have any questions. thank you.

strogonoff commented 11 months ago

@ReesePlews Setting aside the matter of anomalous empty <> markers (which appear to be a regression in Geolexica that should obviously be addressed), is that particular detail of terminological styling rules part of a standard we can and should refer to?

  1. I believe it is common dictionary style that <subject domain> (or “parent concept” in a knowledge graph) is placed directly before a definition only when it’s shown on a page that presents one designation and multiple distinct concepts (and therefore multiple definitions) that share that designation:

    Screenshot 2023-10-18 at 14 42 18
  2. However, in cases where the page has only one concept, the <domain> may be separated from definition and placed higher in visual hierarchy (closer to the designation) for better legibility, for example:

    Screenshot 2023-10-18 at 14 36 36

In Geolexica’s current implementation, a given page always represents a single concept, as in (2). Distinct concepts (possibly under different respective subject domains) always reside on different pages, so a situation like the one in the first screenshot is not possible. The assumption was that in well-managed glossaries it’d be unusual for more than one concept to share the exact same designation, so grouping by designation is not particularly important.

However, I recall discussions that Geolexica should provide per-designation pages, which can show multiple distinct concepts on the same page in the unlikely case that multiple concepts do happen to have the same or very similar designation. I believe that when such a feature is implemented, on multi-concept designation pages, as in (1), Geolexica should indeed follow the style on your screenshot and place the domain together with the concept definition rather than with the designation.


All that said, if there exists an applicable standard prescribing that <domain> markers are definitely positioned in close proximity to definitions even if the page does not describe more than one distinct concept, then we should update the layout to match that standard regardless of the above.

ReesePlews commented 11 months ago

thank you @strogonoff , i have not seen the notation from the images you have pasted into the reply above. what document are they from?

normally i suggest in tc211, that authors consult the ISO/IEC Directives Part 2 (clause 16) and then clause 16.6 Figure 1 https://www.iso.org/sites/directives/current/part2/index.xhtml#_idTextAnchor231 for an example of a well formed terminological entry.

however these style constructs are defined elsewhere. i want to say ISO 10241-1 Table 1 or in Annex A for many examples. i believe 10241-1 also defines "domain" / "subject" but i have not checked.

typically people do not have those specialized terminological standards so that is why i tend to guide people to the directives.

if you need me to check more specifically, please let me know

i think we should keep the domain at the start of the definition in < >s.

HassanAkbar commented 11 months ago

@ReesePlews I have updated the code to fix the issue of empty <>. I'm working on the second issue.

> when a domain is present they should be shown at the start of the definition (not at the end of the concept string)

This is because in the yaml file the designation is class <UML> but according to the glossarist model the designation should be class and domain should be UML. @ronaldtse I think we should process this and extract the domain from designation and update the yaml files. I'm working on it right now.
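
A minimal sketch of the extraction described above, assuming the domain always appears as a trailing `<...>` marker in the designation string. The helper name `split_designation` and the regular expression are hypothetical illustrations, not the code from the actual PR, and the real YAML files may need additional handling.

```python
import re

# Hypothetical pattern: a designation followed by a trailing "<domain>" marker.
_TRAILING_DOMAIN = re.compile(
    r"^(?P<designation>.*?)\s*<(?P<domain>[^<>]*)>\s*$"
)


def split_designation(raw):
    """Split e.g. "class <UML>" into ("class", "UML").

    Returns (designation, None) when there is no domain marker
    or when the marker is empty ("<>").
    """
    match = _TRAILING_DOMAIN.match(raw)
    if match is None:
        # No trailing <...> marker: the whole string is the designation.
        return raw.strip(), None
    designation = match.group("designation").strip()
    # Treat an empty or whitespace-only marker as "no domain".
    domain = match.group("domain").strip() or None
    return designation, domain


print(split_designation("class <UML>"))
print(split_designation("abbreviation"))
```

Returning `None` for a missing or empty marker lets downstream code distinguish "no domain" from a real value, which is the distinction the empty `<>` regression depends on.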

ReesePlews commented 11 months ago

@HassanAkbar thank you for working on this so quickly. i am not expecting any new updates or enhancements, just want it to "go back to how it was", but i will leave those decisions up to you and the team as you are checking the code right now. in the future some revisions will need to be made, but they should be discussed more when there is time. thank you again for working on this.

HassanAkbar commented 11 months ago

> @HassanAkbar thank you for working on this so quickly. i am not expecting any new updates or enhancements, just want it to "got back to how it was" but i will leave those decisions up to you and the team as you are checking the code right now. in the future some revisions will need to be made, but they should be discussed more when there is time. thank you again for working on this.

Thanks @ReesePlews. No worries, I am already working on fixing the other issue.

ronaldtse commented 11 months ago

@strogonoff the presentation model for Geolexica is ISO 10241-1, so <domain> is always placed ahead of the concept definition.
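
The rule discussed in this thread (domain ahead of the definition, and no `<>` when the domain is empty) could be sketched as follows. This is an illustration only: `render_definition` is a hypothetical name, and Geolexica's actual templates are not shown in this thread.

```python
def render_definition(definition, domain=None):
    """Prefix the definition with "<domain>" only when a domain exists."""
    # Normalise None and whitespace-only values to an empty string.
    domain = (domain or "").strip()
    if not domain:
        # No domain: emit the definition alone, with no stray "<>".
        return definition
    return f"<{domain}> {definition}"


# Placeholder definition text, not taken from the glossary.
print(render_definition("example definition text", "UML"))
print(render_definition("example definition text", ""))
```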

ronaldtse commented 11 months ago

> This is because in the yaml file the designation is class <UML> but according to the glossarist model the designation should be class and domain should be UML. @ronaldtse I think we should process this and extract the domain from designation and update the yaml files. I'm working on it right now.

Thank you @HassanAkbar , the intended treatment you proposed is correct, so we'll wait for that PR. Thanks!

ReesePlews commented 11 months ago

thank you all for checking this so quickly. really appreciate that!

HassanAkbar commented 11 months ago

> Thank you @HassanAkbar , the intended treatment you proposed is correct, so we'll wait for that PR. Thanks!

@ronaldtse the PR is ready -> https://github.com/geolexica/isotc211-glossary/pull/34

ronaldtse commented 11 months ago

@HassanAkbar the issue is still present. Can you please check?

https://isotc211.geolexica.org/concepts/43/

Screenshot 2023-10-20 at 12 07 54 PM

ronaldtse commented 11 months ago

Do we need to release the new gem and re-generate the site? Thanks.

HassanAkbar commented 11 months ago

@ronaldtse I've updated and released the related gems and re-generated the site, it is working now.

Here is the PR for extracting domains from terms -> https://github.com/geolexica/isotc211-glossary/pull/34

Screenshot 2023-10-20 at 12 42 56 PM