EBISPOT / hancestro

https://ebispot.github.io/hancestro/
Creative Commons Attribution 4.0 International

AfPO_0000285 resource URL contains non-ASCII character #58

Open jggatter opened 1 week ago

jggatter commented 1 week ago

I encountered an error when trying to parse the latest hancestro.owl with the Python package pronto. Please see https://github.com/fastobo/fastobo-py/issues/343

Would HANCESTRO consider replacing the resource URL https://en.wikipedia.org/wiki/Efé_people with the URL-safe version, https://en.wikipedia.org/wiki/Ef%C3%A9_people?
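For reference, the URL-safe form can be produced with Python's standard library. This is only a minimal sketch: `to_ascii_url` is a hypothetical helper name, it assumes the URL is UTF-8 text, and it leaves the host name untouched (a non-ASCII host would need IDNA encoding instead).

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_ascii_url(url: str) -> str:
    """Percent-encode non-ASCII characters in a URL's path, query, and fragment.

    Hypothetical helper for illustration; the host name is left as-is.
    """
    parts = urlsplit(url)
    # '%' is kept safe so that already-encoded sequences are not encoded twice.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    fragment = quote(parts.fragment, safe="%")
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))


print(to_ascii_url("https://en.wikipedia.org/wiki/Efé_people"))
# prints https://en.wikipedia.org/wiki/Ef%C3%A9_people
```

Because `%` is treated as safe, running the helper on an already-encoded URL returns it unchanged.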

daniwelter commented 1 week ago

@jggatter Thank you for raising this issue and sorry to hear our latest release is causing you parsing issues. As the property in question originates from AfPO rather than HANCESTRO, I've asked the AfPO team to look into implementing the fix.

anitacaron commented 3 days ago

Hi @jggatter, did you also run into problems with two DBpedia classes: https://dbpedia.org/page/Réunion and https://dbpedia.org/page/São_Tomé_and_Príncipe?

After fixing the issue in AfPO, I tested parsing with fastobo and got an error on the Réunion class. I then found the same issue with São Tomé and Príncipe. Both classes are defined in HANCESTRO.
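Rather than discovering these one parser error at a time, the file could be scanned for every URL containing non-ASCII characters in one pass. The following is a rough sketch, not a fastobo or pronto feature: `find_non_ascii_urls` is a hypothetical helper, the regular expression is a simple approximation of a URL, and the sample markup is made up for illustration rather than copied from hancestro.owl.

```python
import re

# Approximate URL matcher: stops at whitespace, quotes, and angle brackets.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def find_non_ascii_urls(text: str) -> list:
    """Return each URL in `text` that contains a non-ASCII character, in order, without duplicates."""
    found = []
    for url in URL_PATTERN.findall(text):
        if not url.isascii() and url not in found:
            found.append(url)
    return found


# Made-up sample markup, for illustration only.
sample = '''
<oboInOwl:hasDbXref>https://en.wikipedia.org/wiki/Efé_people</oboInOwl:hasDbXref>
<rdfs:seeAlso rdf:resource="https://dbpedia.org/page/Réunion"/>
<rdfs:seeAlso rdf:resource="https://dbpedia.org/page/Nigeria"/>
'''

for url in find_non_ascii_urls(sample):
    print(url)
```

To check a real release, the same function could be called on the contents of the OWL file, read with `encoding="utf-8"`.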

jggatter commented 1 day ago

Thanks @daniwelter!

Hi @anitacaron, it seems that the two classes you mention appear in the hancestro.owl file before Efé people. In my original attempt, pronto errored out on Efé people and was unable to continue. I suppose parsing might not proceed linearly from the start of the file to the end (it probably parses the file as a tree). In any case, it doesn't surprise me that these other resources containing non-ASCII characters cause the same issue.

In the fastobo issue I linked, the maintainer replied that they will add UTF-8 support, though I'm not sure when. Even so, they advise that ontologies stick to the ASCII character set to better ensure support across operating systems.

I've had our organization revert to using a previous HANCESTRO version for now, so we're not blocked. I look forward to eventually upgrading to include the AfPO contributions!

Thanks, James