Open tammy-culture opened 2 months ago
@tammy-culture I took a look, and this website will be difficult at the moment because of the way the JSON-LD was added. There is nothing wrong with the way they added their JSON-LD; it's only that it uses an approach that we don't support yet. I will assign this to @dev-aravind so he can start working on it.
@dev-aravind Can you design a technique to crawl this site with a headless browser? The JSON-LD is not in the webpage source but is added by JavaScript in the browser, so it only appears after the page has loaded and the JavaScript on the page has executed. We can discuss. I also propose that you use the Orion repo and make the choice of how to crawl an option.
@dev-aravind Please start with only the first page of events.
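To illustrate the request above, here is a minimal sketch of one way to read JSON-LD that only exists after JavaScript has run. It is written in Python with Playwright as an assumption; it is not the Orion implementation, and the names `render_page` and `extract_jsonld` are hypothetical. The parsing step uses only the standard library, so it can be checked without a browser.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks instead of failing the crawl


def extract_jsonld(html):
    """Return every JSON-LD block found in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks


def render_page(url, timeout_ms=30000):
    """Load url in headless Chromium and return the DOM after scripts have run.

    Assumption: Playwright is installed (pip install playwright; playwright install chromium).
    """
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            # "networkidle" waits until network activity settles, which gives
            # the page's scripts a chance to inject the JSON-LD.
            page.goto(url, wait_until="networkidle", timeout=timeout_ms)
            return page.content()
        finally:
            browser.close()
```

A crawl of the first page would then be `extract_jsonld(render_page("https://capitol.nb.ca/en/tickets-events"))`. Keeping rendering and parsing separate means the headless step can be switched on or off as an option while the same parser handles static pages.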
@saumier The data is now available in artsdata under the artifact name capitolnb-ca
@saumier Question: Is "capitolnb-ca" ready to be added as an aggregator (acapitolnb-ca) to the ArtscultureNB Calendar in Footlight CMS?
@tammy-culture No. Not yet. There are some little things to fix.
@dev-aravind Please work on the following:
capitol-nb-ca
@saumier These issues are now fixed in artsdata. You can find a sample event here.
@saumier Thank you. Please let me know when I can put in a request to add a capitol-nb-ca aggregator.
Status:
@tammy-culture There is another thing with importing capitol.nb.ca. The auto-minting only works right now with CMS and Footlight because those events have URIs. The events on capitol.nb.ca do not have URIs (they are temporary and change with each load).
I have turned off the crawl schedule until I can work on auto-minting/linking with this type of website. I will mint the 11 events that are currently there so they can be loaded into CMS.
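As a starting point for discussion, here is a sketch of one possible way to recognize the same event across crawls when the page does not give it a stable URI. This is an assumption, not how Artsdata auto-minting works: it builds a fingerprint from fields that should not change between loads (name, start date, venue name). The function names are hypothetical.

```python
import hashlib
import unicodedata


def normalize(text):
    """Lowercase, unify Unicode forms, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", str(text)).casefold()
    return " ".join(text.split())


def event_fingerprint(event):
    """Derive a repeatable key from an event's name, startDate, and venue.

    Assumption: these three fields are enough to tell events apart on this
    site. Any temporary "@id" or "url" value is deliberately ignored.
    """
    location = event.get("location", "")
    place = location.get("name", "") if isinstance(location, dict) else location
    parts = [
        normalize(event.get("name", "")),
        normalize(event.get("startDate", "")),
        normalize(place),
    ]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()[:16]
```

Two crawls that return the same show with different temporary identifiers would yield the same fingerprint, which could then be used to link to an already minted event instead of minting a duplicate.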
@tammy-culture Please request (create an issue) that http://kg.artsdata.ca/culture-creates/artsdata-orion/capitol-nb-ca gets loaded into CMS.
@dev-aravind I am reopening this issue because we need more events than only those on the home page. Can you explore different ways to get all their events? For example, clicking on a month in the calendar or a pagination approach.
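A rough sketch of the pagination idea, under stated assumptions: how capitol.nb.ca exposes further events (month clicks, page links, or something else) is not known here, so the page-loading step is left as a caller-supplied function. `crawl_all_pages`, `fetch_page`, and `key` are hypothetical names.

```python
def crawl_all_pages(fetch_page, key, max_pages=50):
    """Collect events page by page until a page adds nothing new.

    fetch_page(n) -> list of event dicts for page (or month) n.
    key(event)    -> a hashable value identifying the event, for de-duplication.
    max_pages is a safety cap so a misbehaving site cannot loop forever.
    """
    seen = {}
    for page_number in range(1, max_pages + 1):
        new_events = 0
        for event in fetch_page(page_number):
            k = key(event)
            if k not in seen:
                seen[k] = event
                new_events += 1
        if new_events == 0:
            break  # empty page, or only repeats: assume the end was reached
    return list(seen.values())
```

For example, `crawl_all_pages(load_month, key=lambda e: (e.get("name"), e.get("startDate")))`, where `load_month` would drive the headless browser to the nth month of the calendar. Stopping on the first page with no new events keeps the crawl short but would miss later events if the calendar has an empty month in between.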
I set the priority to high because Tammy's client ArtsCultureNB needs to launch this client by end of October.
@saumier All the events from this year are crawled and available in artsdata right now here.
As per our stand-up meeting this morning, we are able to grab approximately 3 to 4 months of upcoming events. This is progress from the 10 events we were previously able to grab. We can now grab approximately 30 of the approximately 72 total upcoming events. Tammy to mention to client.
Please grab JSON-LD from the following website (https://capitol.nb.ca/en/tickets-events) and add it to Artsdata.ca.
The reason why is that our client ArtscultureNB would like to include these events in their Production CMS.