Closed alvinsw closed 1 year ago
Can you please provide a very basic crate with a basic root dataset and one entity using an arcp identifier for testing?
An example of a very basic crate:
{
  "@context": "https://w3id.org/ro/crate/1.1/context",
  "@graph": [
    {
      "@type": "CreativeWork",
      "@id": "ro-crate-metadata.json",
      "identifier": "ro-crate-metadata.json",
      "about": {
        "@id": "arcp://name,cooee-corpus/corpus/root"
      },
      "conformsTo": {
        "@id": "https://w3id.org/ro/crate/1.1"
      }
    },
    {
      "@id": "arcp://name,cooee-corpus/corpus/root",
      "@type": "Dataset",
      "@reverse": {},
      "name": "Test",
      "hasMember": [
        {
          "@id": "arcp://name,cooee-corpus/item/1-001"
        }
      ]
    },
    {
      "@id": "arcp://name,cooee-corpus/item/1-001",
      "@type": "RepositoryObject",
      "conformsTo": {
        "@id": "https://purl.archive.org/language-data-commons/profile#Object"
      },
      "name": "Text 1-001 1788 Phillip, Arthur"
    }
  ]
}
You can see that arcp://name,cooee-corpus/item/1-001 becomes a Thing and # is added as a prefix in the editor. To see the performance issue, please use the previously attached zip file as the test data.
There are two issues here.
The code wasn't handling arcp identifiers correctly. That is now fixed.
The slowness was not due to the arcp bug but to the massive entity lists on some properties (e.g. hasPart). The browser was being crushed trying to render all those DOM elements. The only fix I can think of, which is in this commit, is to paginate those massive lists. So the code now has a default page size of 50 elements and a filter box to filter the list. I figure most people don't need to see the whole list at once, and pagination deals with that. Most people are probably looking for something specific in the list, and filtering solves that.
I tried setting a large page size, but when a few properties have large arrays on them, that adds up quite significantly in terms of browser load. 50 per page seems to be a reasonable default for now.
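For anyone following along, here is a minimal sketch of the filter-then-paginate idea described above. It is not the editor's actual code: `paginateEntities`, its option names, and the choice to match the filter against `@id` and `name` are all assumptions made for illustration. Only the default page size of 50 comes from this thread.

```javascript
// Hypothetical helper, not the editor's real implementation.
const DEFAULT_PAGE_SIZE = 50; // default page size mentioned in this thread

function paginateEntities(entities, options = {}) {
  const { page = 1, pageSize = DEFAULT_PAGE_SIZE, filter = "" } = options;
  const needle = filter.trim().toLowerCase();

  // Filter first, so paging applies only to the matching subset.
  // Assumption: the filter is a case-insensitive substring match on @id and name.
  const matches = needle
    ? entities.filter((entity) =>
        [entity["@id"], entity.name].some(
          (value) => typeof value === "string" && value.toLowerCase().includes(needle)
        )
      )
    : entities;

  // Clamp the requested page into the valid range [1, totalPages].
  const totalPages = Math.max(1, Math.ceil(matches.length / pageSize));
  const current = Math.min(Math.max(1, page), totalPages);
  const start = (current - 1) * pageSize;

  return {
    items: matches.slice(start, start + pageSize), // only these get rendered
    page: current,
    totalPages,
    total: matches.length,
  };
}
```

Only `items` would be handed to the rendering layer, so the number of DOM elements per property stays bounded by the page size regardless of how long the underlying array is.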
I've just built v0.23.0 with this code.
Just built 0.23.1 to fix a couple of issues in the new code.
Thanks @marcolarosa, I can confirm that the arcp bug is fixed. I think this issue can be closed.
When a crate has an entity that uses the arcp protocol as its id, the validator library fails it, which causes it to create a whole bunch of new objects with ids prefixed by #. The main issue is with the validator library: if the id fails the validation check, the id is replaced in this line:
entity["@id"] = `#${entity["@id"]}`;
From there, it causes a different, bigger problem, especially when the data is big. A lot of new entities are created, which makes loading take forever. I think that before a new entity is created and pushed, it should first be checked whether one already exists.
Test data is attached here as a zip file: ro-crate-metadata.zip
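To illustrate the two suggestions above, here is a rough sketch, assuming a plain array for `@graph`. It does not reflect the validator library's real code or API: `isAbsoluteUri`, `indexGraph` and `ensureEntity` are hypothetical names, and the scheme pattern simply follows the generic URI syntax from RFC 3986. The `Thing` type for a newly created entity mirrors what the editor was observed to produce.

```javascript
// Hypothetical sketch: these names are illustrative, not the validator library's API.

// RFC 3986 scheme syntax: a letter followed by letters, digits, "+", "-" or ".", then ":".
// Under this check, arcp://... counts as an absolute identifier just like https://...
const URI_SCHEME = /^[a-z][a-z0-9+.-]*:/i;

function isAbsoluteUri(id) {
  return typeof id === "string" && URI_SCHEME.test(id);
}

// Build a lookup from @id to entity once, so existence checks are O(1)
// instead of scanning the whole graph for every reference.
function indexGraph(graph) {
  return new Map(graph.map((entity) => [entity["@id"], entity]));
}

// Return the entity with this id, creating and pushing a stub only when
// none exists yet. Assumption: stubs are typed "Thing".
function ensureEntity(index, graph, id) {
  const existing = index.get(id);
  if (existing) return existing;
  const stub = { "@id": id, "@type": "Thing" };
  index.set(id, stub);
  graph.push(stub);
  return stub;
}
```

With a check like `isAbsoluteUri`, an id such as arcp://name,cooee-corpus/item/1-001 would be left alone rather than prefixed with #, and `ensureEntity` would reuse the existing RepositoryObject instead of pushing a duplicate for every reference to it.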