Sandbox for explorative work around SciCat at DESY

Materials-Data-Science-and-Informatics / PIDA

PIDA is a service for providing web resources with Permanent URLs (PURLs), so that they remain available and can be accessed reliably by both humans and machines.

https://purls.helmholtz-metadaten.de/

MIT License

Sandbox for explorative work around SciCat at DESY #60

Open linupi opened 2 months ago

linupi commented 2 months ago

Contact person name

Linus Pithan

Contact person email.

linus.pithan@desy.de

Overview in the README.md

Artifact name

SciCat Datamodel Sandbox

Overview

The purpose of this artifact is to study the feasibility of combining the PIDA service with SciCat and LinkML to serve public data hosted at DESY.

Reach out

DESY GitLab; additional contact: Anjali Aggarwal (anjali.aggarwal@desy.de)

What is your artifact prefix?

scicat-sandbox

Where is the artifact located on the Web (i.e. the URL)?

https://fs-ec.pages.desy.de/scicat/datamodel-sandbox/

Do you want to enable content negotiation?

[X] enable content negotiation

artifact type

HTML, RDF/XML, OWL

saidfathalla commented 2 months ago

Hi Linus, thanks for choosing PIDA. The provided URL requires authentication and also returns a 404 error. Could you please check?

linupi commented 2 months ago

Dear Said,

thanks for your reply. I actually wanted to reach out to you yesterday, but somehow Mattermost wasn't working. Could you point me to an example (repo) that serves the content you are expecting? It's really not clear to us right now what we have to provide (and where we have to host what) so that your service can handle the requests (HTML, RDF/XML, OWL) accordingly. Also, would you have an example of how we can check that the serving works fine (e.g. a Python script sending the different corresponding requests)?

The idea here is really to start with something small and well understood, such as a rebuild of the pizza ontology, so that what is achievable is clear to everyone.

Cheers, Linus

saidfathalla commented 2 months ago

Hi Linus,

The mwo ontology (developed by us) is a good example for you. The mwo PID (http://purls.helmholtz-metadaten.de/mwo/) is redirected to either 1) the docs: https://nfdi-matwerk.pages.rwth-aachen.de/ta-oms/mwo/docs/index.html# or 2) the OWL file: https://git.rwth-aachen.de/nfdi-matwerk/ta-oms/mwo/-/blob/main/mwo.owl, depending on the type of the request (specifically, the 'Accept:' field in the request header).
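As a rough illustration of the redirect logic described above, here is a minimal Python sketch. It is not PIDA's actual implementation: the function names `parse_accept` and `choose_target`, the `TARGETS` table, and the choice of the docs page as the fallback are all assumptions made for this example. Only the two target URLs are taken from this thread.

```python
# Illustrative sketch only; PIDA's real server configuration is not shown in this thread.
HTML_TARGET = "https://nfdi-matwerk.pages.rwth-aachen.de/ta-oms/mwo/docs/index.html#"
OWL_TARGET = "https://git.rwth-aachen.de/nfdi-matwerk/ta-oms/mwo/-/blob/main/mwo.owl"

# Hypothetical mapping from requested media type to redirect target.
TARGETS = {
    "text/html": HTML_TARGET,
    "application/rdf+xml": OWL_TARGET,  # registered media type for RDF/XML
    "rdf/xml": OWL_TARGET,              # value used in the curl example in this thread
}


def parse_accept(header):
    """Return the media types of an Accept header, highest q-value first."""
    entries = []
    for position, part in enumerate(header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for field in fields[1:]:
            name, _, value = field.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:  # q=0 means "not acceptable"
            entries.append((-quality, position, media_type))
    return [media_type for _, _, media_type in sorted(entries)]


def choose_target(accept_header, default=HTML_TARGET):
    """Pick the redirect target for the best-ranked media type we know."""
    for media_type in parse_accept(accept_header):
        if media_type in TARGETS:
            return TARGETS[media_type]
    return default  # assumption: fall back to the human-readable docs
```

For example, `choose_target("text/html")` yields the docs URL, while `choose_target("application/rdf+xml")` yields the OWL file.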

You can test the redirection with curl as follows (note the different results for the same URL):

If you want to get the HTML representation of a resource, you might execute

curl -i --location 'http://purls.helmholtz-metadaten.de/mwo/' --header 'Accept: text/html'

but if you want to get the XML representation, you might execute

curl -i --location 'http://purls.helmholtz-metadaten.de/mwo/' --header 'Accept: rdf/xml'

You can learn more about content negotiation here, and the Best Practice Recipes for Publishing RDF Vocabularies is very helpful.
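Since a Python script for checking the redirection was asked for earlier in this thread, the curl commands above could be mirrored roughly as follows, using only the standard library. This is a sketch: the helper names `build_request`, `resolve`, and `report` are made up for this example, and the list of Accept values simply repeats the two used in the curl commands.

```python
import urllib.request

PURL = "http://purls.helmholtz-metadaten.de/mwo/"
# The two Accept values from the curl commands in this thread.
ACCEPT_TYPES = ["text/html", "rdf/xml"]


def build_request(url, accept):
    """Create a GET request that asks for the given media type."""
    return urllib.request.Request(url, headers={"Accept": accept}, method="GET")


def resolve(url, accept, timeout=10):
    """Follow redirects and return (final URL, status code, Content-Type)."""
    request = build_request(url, accept)
    # urlopen follows HTTP redirects by default, like curl --location.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.geturl(), response.status, response.headers.get("Content-Type")


def report(url=PURL, accept_types=ACCEPT_TYPES):
    """Print where each Accept value ends up (requires network access)."""
    for accept in accept_types:
        final_url, status, content_type = resolve(url, accept)
        print(f"{accept:>10} -> {status} {final_url} ({content_type})")
```

Calling `report()` prints one line per Accept value; if content negotiation works, the final URLs should differ. The registered media type for RDF/XML is `application/rdf+xml`, so it may be worth adding that to `ACCEPT_TYPES` as well; whether PIDA treats it the same as `rdf/xml` is not stated in this thread.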

If you need help, please let me know. Best, Said

linupi commented 1 month ago

Dear Said,

I finally found time to come back to this. Some artifacts are now available at the URLs below. Is this the format that you are expecting, and does it make sense for your content negotiation? The content of the files may still be wrong, but that's the next step for me once I know I am on the right track.

https://fs-ec.pages.desy.de/scicat/datamodel-sandbox/pizza/html/
https://fs-ec.pages.desy.de/scicat/datamodel-sandbox/pizza/pizza.json
https://fs-ec.pages.desy.de/scicat/datamodel-sandbox/pizza/pizza.jsonld
https://fs-ec.pages.desy.de/scicat/datamodel-sandbox/pizza/pizza.owl

https://fs-ec.pages.desy.de/scicat/datamodel-sandbox/scicat/html
https://fs-ec.pages.desy.de/scicat/datamodel-sandbox/scicat/scicat.json
https://fs-ec.pages.desy.de/scicat/datamodel-sandbox/scicat/scicat.jsonld
https://fs-ec.pages.desy.de/scicat/datamodel-sandbox/scicat/scicat.owl

saidfathalla commented 1 month ago

Hi Linus, these look like two different namespaces. I'd expect scicat-sandbox for the following:

https://fs-ec.pages.desy.de/scicat/datamodel-sandbox/scicat/html
https://fs-ec.pages.desy.de/scicat/datamodel-sandbox/scicat/scicat.json
https://fs-ec.pages.desy.de/scicat/datamodel-sandbox/scicat/scicat.jsonld
https://fs-ec.pages.desy.de/scicat/datamodel-sandbox/scicat/scicat.owl

and scicat-pizza for the following:

https://fs-ec.pages.desy.de/scicat/datamodel-sandbox/pizza/html/
https://fs-ec.pages.desy.de/scicat/datamodel-sandbox/pizza/pizza.json
https://fs-ec.pages.desy.de/scicat/datamodel-sandbox/pizza/pizza.jsonld
https://fs-ec.pages.desy.de/scicat/datamodel-sandbox/pizza/pizza.owl

is that correct?

linupi commented 1 month ago

Hi Said, do we really have to use fully separate namespaces for this? You are right that conceptually they are two distinct things, but since it is all still on shaky ground, I was hoping to get away with sub-directories within one namespace, so that we can explore more easily what structure we actually need before we kick off multiple namespaces.

Would it be possible to use sub-directories as separate namespaces in your setting?

PS: Since I will be on leave later this month, I am also including @Anjali-Aggarwal0305 in this conversation to make sure we keep the ball rolling this time.
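To make the sub-directory question above concrete, the following Python sketch shows one way a single `scicat-sandbox` prefix could map sub-paths and media types onto the artifact files listed earlier in this thread. It is purely illustrative: whether PIDA supports this layout is exactly the open question, and the `FORMATS` table, the media-type assignments, and the `resolve` function are assumptions made for this example.

```python
# Hypothetical layout: one PIDA prefix, two sub-directories acting as namespaces.
BASE = "https://fs-ec.pages.desy.de/scicat/datamodel-sandbox"
PREFIX = "scicat-sandbox"
SUB_NAMESPACES = ("pizza", "scicat")

# Assumed mapping from media type to the file layout used in this thread.
FORMATS = {
    "text/html": "{base}/{sub}/html/",
    "application/rdf+xml": "{base}/{sub}/{sub}.owl",
    "application/ld+json": "{base}/{sub}/{sub}.jsonld",
    "application/json": "{base}/{sub}/{sub}.json",
}


def resolve(purl_path, media_type):
    """Map e.g. 'scicat-sandbox/pizza/' plus a media type to an artifact URL.

    Returns None for an unknown prefix, sub-directory, or media type.
    """
    parts = [part for part in purl_path.split("/") if part]
    if len(parts) != 2 or parts[0] != PREFIX or parts[1] not in SUB_NAMESPACES:
        return None
    template = FORMATS.get(media_type)
    if template is None:
        return None
    return template.format(base=BASE, sub=parts[1])
```

With this table, `resolve("scicat-sandbox/pizza/", "application/rdf+xml")` points at the `pizza.owl` file, and the same call with `scicat` in place of `pizza` points at `scicat.owl`, so both sub-directories share one prefix.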