Open jmckenna opened 1 year ago
ODISCat purpose is to list your organization's Products or Services
it's to list data sources - what those data sources describe is secondary. One organisation / individual in OceanExpert can be linked to one or more ODISCat data source entries.
These can be APIs, web services, portals, or any other mechanism through which (meta)data can be acquired
For reference, the new ODISCat pattern template (that we drafted together on 2023-08-17) is here: odisCatOrganization-example.json (link updated on 2024-04-26 to new repo)
see #308
This should not be closed; #308 is about the format of the pattern, not about ODISCat providing it.
setting urgency label here (@pbuttigieg please adjust as necessary - I created 3 new labels for urgency)
reporting an internal ODISCat issue here as well, as it applies to this ticket:
@arnounesco some important questions/points for the ODISCat-ODIS connection:
url value for itemOffered (see template), which will point to the ODIS-Arch URL
value from the ODISCat entry ready-for-harvest-into-ODIS
disable-ODIS-harvest, as over time a partner's endpoint could become unmaintained, therefore affecting the ODIS graph/searches
(to discuss in tomorrow's WP2 meeting) related to https://github.com/iodepo/ODISCat/issues/103
cc @pbuttigieg
@arnounesco I've updated the ODISCat JSON-LD template with @pbuttigieg's changes (to use @type CreativeWork)
@arnounesco This looks good.. just one small issue...
{
"@context": {
"@vocab": "https://schema.org/"
},
"@id": "https://catalogue.odis.org/view/256
",
"@type": "Organization",
"email": "info@ico",
There is a control character \n at the end of the @id value. Would not be an issue in the object literals, but in the subject IRI it's not a valid character.
I can parse such things out of course client side, but better to have it valid server side.
Note that the Google validator (https://validator.schema.org/#url=http%3A%2F%2Fcatalogue.odis.org%2Fview%2F256) fixes such things. Sometimes I kinda wish they wouldn't. Or at least have a "strict" mode.
If you can use a trim function on the strings or something like that, it is likely a simple fix.
Thanks Doug
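A minimal sketch of the trim suggested above, written in Python purely for illustration. The ODISCat server code is not shown in this thread, so the function name `clean_id` and the sample record are assumptions, not the actual implementation:

```python
import json


def clean_id(record: dict) -> dict:
    """Return a copy of a JSON-LD record with whitespace trimmed from "@id"."""
    cleaned = dict(record)
    if isinstance(cleaned.get("@id"), str):
        # str.strip() removes leading/trailing spaces, tabs, and newlines.
        cleaned["@id"] = cleaned["@id"].strip()
    return cleaned


# Hypothetical record reproducing the trailing newline reported above.
raw = {
    "@context": {"@vocab": "https://schema.org/"},
    "@id": "https://catalogue.odis.org/view/256\n",
    "@type": "Organization",
}
print(json.dumps(clean_id(raw), indent=2))
```

Only the subject IRI is touched here; literal values are left alone, since trailing whitespace there is legal even if untidy.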
@fils I cannot reproduce this; how did you get that content? I tried viewing the source and downloading the file, and there is no newline anywhere. There is no newline in the code either. None of this means you are wrong, but I have no way to check the result of any change I make.
I also cannot reproduce. (I use the command :set list inside vi on Ubuntu, to show hidden characters for the test entry)
curl -OL https://catalogue.odis.org/view/256
vi 256
:set list
gives:
{$
"@context": {$
"@vocab": "https://schema.org/"$
},$
"@id": "https://catalogue.odis.org/view/256",$
"@type": "Organization",$
"email": "info@xxxx",$
Interesting.. I see what you are both seeing too.
Let me check if the python library is messing something up. There might be a processing setting I need to play with.
So I tried with extruct rather than BeautifulSoup and I still see it.
I get
{
"@context": {
"@vocab": "https://schema.org/"
},
"@id": "https://catalogue.odis.org/view/263\n ",
"@type": "Organization",
"email": "nodc@meteo.ru",
So now you see the \n with spaces or a tab after it..
I'm trying to resolve why I see this in python, with two different libraries, but you don't see it in vi.
@jmckenna @arnounesco
really odd, if I look in the "source view" of the browser it looks fine.. but no matter how I pull it down with python, I get
{'@context': {'@vocab': 'https://schema.org/'}, '@id': 'https://catalogue.odis.org/view/257\n ', '@type': 'Organization', 'email': 'nodc@meteo.ru', 'contactPoint':
with the \n in the @id string.
Still trying to explain this.
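For anyone else chasing this, a small diagnostic sketch that may help: it walks an already-parsed JSON-LD document and reports every string value with leading or trailing whitespace, printed with `repr()` so the `\n` is visible. The helper name `find_padded_strings` and the path notation are my own, and the sample document is just the fragment quoted above:

```python
from typing import Any, Iterator, Tuple


def find_padded_strings(node: Any, path: str = "$") -> Iterator[Tuple[str, str]]:
    """Yield (path, repr) for every string with leading/trailing whitespace."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_padded_strings(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_padded_strings(value, f"{path}[{index}]")
    elif isinstance(node, str) and node != node.strip():
        # repr() makes hidden characters such as "\n" visible.
        yield path, repr(node)


doc = {
    "@context": {"@vocab": "https://schema.org/"},
    "@id": "https://catalogue.odis.org/view/257\n ",
    "@type": "Organization",
    "email": "nodc@meteo.ru",
}
for path, shown in find_padded_strings(doc):
    print(path, shown)
```

Running the same check on the output of two different fetch paths would show whether the whitespace is in the served bytes or is introduced client side.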
OK, I think I found it. The python package "response" seems to be the issue; I replaced it with httpx and that seems to be working now. Very odd, but I have no interest in resolving the issue with that package and will simply use httpx.
Thanks!
FYI, I also indexed with Gleaner, which worked but found 1 error in the record at https://catalogue.odis.org/view/1105, which is confirmed at: https://validator.schema.org/#url=https%3A%2F%2Fcatalogue.odis.org%2Fview%2F1105
Gleaner reports
Error in unmarshaling json: invalid character ' ' in string escape code"
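As far as I can tell, that message is what a Go JSON decoder reports when a backslash inside a string is followed by a character that is not a valid escape (here a space). A rough Python sketch of how the position of such an error could be located; the sample strings are hypothetical and are not the actual content of record 1105:

```python
import json
from typing import Optional, Tuple


def find_json_error(text: str) -> Optional[Tuple[int, int, str]]:
    """Return (line, column, message) for the first JSON syntax error, else None."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return err.lineno, err.colno, err.msg
    return None


# Hypothetical examples: a backslash followed by a space is not a legal escape.
bad = '{"name": "Institute\\ of Oceanography"}'
good = '{"name": "Institute of Oceanography"}'
print(find_json_error(bad))
print(find_json_error(good))
```

If the catalogue stores free text containing backslashes, escaping them through a JSON serializer on output, rather than by string concatenation, would avoid this class of error.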
So, I went ahead and activated the GitHub action for the configuration builder for ODIS Cat.
After fixing typos in the requirements.txt it seems to be working, but there is an odd regression in the YAML output. I need to check the version of Python and the libraries installed in the action VM.
There also seems to be an odd error condition when the generated config file doesn't have any changes from the previous version. Reviewing this.
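One possible way to sidestep the no-change case, sketched under the assumption that the failure comes from a later step (a commit, for example) that errors when nothing has changed. The thread does not confirm that cause, and the helper name `write_if_changed` and the file name are hypothetical:

```python
import hashlib
import tempfile
from pathlib import Path


def write_if_changed(path: Path, content: str) -> bool:
    """Write content to path only if it differs; return True when written."""
    data = content.encode("utf-8")
    if path.exists():
        # Compare digests of the stored bytes and the newly generated bytes.
        old_digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if old_digest == hashlib.sha256(data).hexdigest():
            return False
    path.write_bytes(data)
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "config.yaml"  # hypothetical output file
    print(write_if_changed(target, "sources: []\n"))
    print(write_if_changed(target, "sources: []\n"))
```

The boolean return value could be used by the workflow to skip any follow-up steps when the generated config is identical to the previous one.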
In the end, there are some items we use in the config file that are not currently in the ODIS Catalog properties.
Will build a list of these for this issue.
The yaml issue is resolved, code generates now.
Some observations:
Note:
Refs:
cc @arnounesco @pbuttigieg @fils