Closed: saumier closed this 9 months ago
@saumier The workflow and Ruby file are up for this webpage, but I accidentally set the artifact name to yardbirdsuite-events the first time. As a result, there is duplicated data in Nebula under the artifact names yardbirdsuite-com and yardbirdsuite-events. Can you please delete the duplicated entities? Apologies for the inconvenience.
@saumier I wrote a SPARQL query to fix the blank nodes, based on the one you used for scenesfrancophones. Because some places also had blank node issues, I made some modifications to it. However, the blank nodes are not being replaced by UUIDs yet. Can you help me work on this one?
@dev-aravind I took a look at the data, and I think we don't need to replace the blank nodes of the places because they are all nested inside the events. I only need a way to access the top-level entities. So please use the same SPARQL as you did in IPAA.
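To illustrate the "top-level only" idea, here is a minimal Ruby sketch. It is not the SPARQL used in IPAA or scenesfrancophones. It assumes triples are plain [subject, predicate, object] arrays and that blank nodes are strings starting with "_:"; the method names are hypothetical.

```ruby
require 'securerandom'

# Blank nodes are modelled as strings starting with "_:" (an assumption made
# for this sketch, not the data model used in artsdata-orion).
def blank_node?(term)
  term.is_a?(String) && term.start_with?('_:')
end

# Give every top-level blank node (one that is never the object of another
# triple) a urn:uuid identifier. Nested blank nodes are left untouched.
def mint_top_level_uuids(triples)
  nested = triples.map { |_s, _p, o| o }.select { |o| blank_node?(o) }.uniq
  top_level = triples.map(&:first).select { |s| blank_node?(s) }.uniq - nested
  mapping = top_level.to_h { |node| [node, "urn:uuid:#{SecureRandom.uuid}"] }
  # Top-level nodes never appear as objects, so only subjects need rewriting.
  triples.map { |s, p, o| [mapping.fetch(s, s), p, o] }
end

# Example: an event with a nested place.
triples = [
  ['_:event1', 'rdf:type', 'schema:Event'],
  ['_:event1', 'schema:location', '_:place1'],
  ['_:place1', 'rdf:type', 'schema:Place']
]

mint_top_level_uuids(triples).each { |triple| puts triple.inspect }
```

In this example the event gets a urn:uuid subject, while the place stays a blank node because it is only reachable through the event.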
I will remove entities with blank nodes from the list pages of Nebula. I added them temporarily so you could see that they were there. https://github.com/culturecreates/nebula/commit/4830b6afb5edcd2607654782f4ce12a707de786b
Also, it is not a good idea to filter on URIs like schema:Event unless you are using RDF inferencing. This is because an event may have the type schema:MusicEvent, which is a sub-class of schema:Event. When inferencing is turned on, filtering with schema:Event will include all sub-classes such as schema:MusicEvent and schema:DanceEvent. We are currently not using inferencing when we run SPARQL on local graphs, so the filter will not pick up the sub-classes. There is a way to use inferencing when we work on local graphs, but we don't need to yet. In the schema.org vocabulary loaded into Artsdata, you can see that schema:MusicEvent is a subClassOf schema:Event in Artsdata here
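As a rough illustration of the difference, the Ruby sketch below compares an exact type match with a match that walks up the sub-class chain. The small hierarchy table and the method name are assumptions made for the example; they are not taken from the Artsdata code. In SPARQL, a similar effect can be had without inferencing by using a property path such as `a/rdfs:subClassOf*`, provided the subClassOf triples are in the graph being queried.

```ruby
# True when `type` is `target` or a direct or indirect sub-class of it,
# according to a child => parent lookup table.
def kind_of_class?(type, target, subclass_of)
  seen = []
  current = type
  while current && !seen.include?(current)
    return true if current == target
    seen << current
    current = subclass_of[current]
  end
  false
end

# Illustrative child => parent table covering only the classes named above.
hierarchy = {
  'schema:MusicEvent' => 'schema:Event',
  'schema:DanceEvent' => 'schema:Event'
}

types = ['schema:Event', 'schema:MusicEvent', 'schema:DanceEvent', 'schema:Place']

# Exact match: what a filter on schema:Event does without inferencing.
exact = types.select { |t| t == 'schema:Event' }

# Sub-class aware match: what inferencing would effectively give us.
with_subclasses = types.select { |t| kind_of_class?(t, 'schema:Event', hierarchy) }

puts exact.inspect
puts with_subclasses.inspect
```

The exact match keeps only schema:Event, while the sub-class aware match also keeps schema:MusicEvent and schema:DanceEvent.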
@saumier The blank node replacement is up in this PR along with unit tests. Please review it and let me know if you need any changes.
@dev please check my requested changes in https://github.com/culturecreates/artsdata-orion/pull/25.
@saumier Other than the eventAttendanceMode and eventStatus values, the data looks fine. Let me know what you think.
@dev-aravind Looking good. I have one question: why is the description missing from this event even though it is present in the JSON-LD of the web page? It shows as " " in GraphDB. https://artsdata-nebula-d1ec887e2637.herokuapp.com/entity?uri=urn%3Auuid%3A55828532-6126-418a-9cd7-1e9e26895590
@saumier The description for this event is a non-breaking space.
@dev-aravind In the web page you scraped I see a description "http://schema.org/description":[{"@value":"Tae Kim, is a pianist, arranger, composer, and educator based in Alberta."}]
So my question remains: Why is the description missing? Here is the link to the web page https://yardbirdsuite.com/shows/tuesday-jam-hosted-by-tae-kim/
We need to figure out at which point the description goes missing. Is it before sending to the Artsdata Databus or after?
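One way to narrow this down is to run the same check on the description at each step (after scraping, before posting to the Databus, and on the loaded data). Below is a minimal Ruby sketch of such a check; the method names are hypothetical and not part of artsdata-orion. It treats a description made only of whitespace, including a non-breaking space (U+00A0), as missing.

```ruby
# True when a value is nil, empty, or made only of whitespace.
# [[:space:]] also matches Unicode spaces such as U+00A0 (non-breaking
# space), which String#strip does not remove.
def blank_text?(value)
  value.nil? || value.to_s.gsub(/[[:space:]]/, '').empty?
end

# Hypothetical checkpoint helper: report whether a description survived
# the step that has just run.
def description_status(description)
  blank_text?(description) ? 'missing' : 'present'
end

puts description_status("\u00A0")
puts description_status('Tae Kim, is a pianist, arranger, composer, and educator based in Alberta.')
```

The first call reports the description as missing and the second as present, so logging this status at each step would show where the value changes.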
@saumier Assigning this to you, as the description was retained when I re-ran the workflow.
Looks good.
Yard Bird Suite has events listed on this webpage: https://yardbirdsuite.com/events/
Please name this artifact: yardbirdsuite-com
Link to Huginn https://huginn-staging.herokuapp.com/scenarios/59/diagram
@dev-aravind Please use the existing repo artsdata-orion to add your code and workflows. This will be a repo for many websites, which we can keep together in the same repo because no one else is collaborating on these. I think we will hit 100 websites before the end of the year, so please structure the code so that it does not repeat itself (DRY). If an organization wants its own repo in the future, we can split its website off, but until then let's put them all in this repo unless a repo for the website already exists.