bio-guoda / preston

a biodiversity dataset tracker
MIT License

enable pointing into individual Zotero records #287

Closed jhpoelen closed 1 month ago

jhpoelen commented 1 month ago

Currently, #281 enables tracking of the paginated JSON arrays of records provided by the Zotero API.

To make processing easier, we'd like to get versions of individual records rather than of the arrays.

Suggest splitting the paginated JSON arrays from the Zotero API into individual top-level JSON objects. This way, we can point into individual Zotero records for subsequent processing.
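
One way to picture the proposed split is the sketch below. It is not Preston's implementation: the function name `top_level_object_ranges` and the sample page are made up for illustration, and the 1-based, inclusive byte ranges are an assumption. The scanner works on raw bytes (not decoded characters) so that offsets stay valid when records contain non-ASCII text.

```python
import json

def top_level_object_ranges(data: bytes) -> list[tuple[int, int]]:
    """Return assumed 1-based, inclusive (start, end) byte ranges of the
    objects that sit directly inside the outermost JSON array."""
    ranges = []
    depth = 0          # nesting depth of {} and [] outside of strings
    in_string = False  # True while scanning inside a JSON string
    escaped = False    # True right after a backslash inside a string
    start = None
    for i, b in enumerate(data):
        ch = chr(b)
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch in "{[":
            depth += 1
            if ch == "{" and depth == 2:
                start = i + 1  # convert 0-based index to 1-based offset
        elif ch in "}]":
            if ch == "}" and depth == 2 and start is not None:
                ranges.append((start, i + 1))
                start = None
            depth -= 1
    return ranges

# Hypothetical two-record page; braces inside strings must not confuse the scan.
page = '[\n    {"key": "A", "name": "José"},\n    {"key": "B", "note": "}"}\n]'.encode("utf-8")
ranges = top_level_object_ranges(page)
for start, end in ranges:
    print(f"b{start}-{end}", json.loads(page[start - 1:end]))
```

Each printed range can be sliced out of the page and parsed as a standalone JSON object, which is the property the suggestion relies on.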

Currently the following statements are generated:


<https://api.zotero.org/groups/5435545/items?start=0&limit=100> <http://purl.org/pav/hasVersion> <hash://sha256/3536ffe05d4e52a99ce44959bee3696f9f7f9ce957128670811ce2b06a92627a>

Suggest including content id references to individual records, making it easier to refer to each object individually.
jhpoelen commented 1 month ago

Now,

preston track https://www.zotero.org/groups/5532807

yields

<https://preston.guoda.bio> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/prov#SoftwareAgent> <urn:uuid:4fdd8a63-85d7-4621-92a1-c8287782b40b> .
<https://preston.guoda.bio> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/prov#Agent> <urn:uuid:4fdd8a63-85d7-4621-92a1-c8287782b40b> .
<https://preston.guoda.bio> <http://purl.org/dc/terms/description> "Preston is a software program that finds, archives and provides access to biodiversity datasets."@en <urn:uuid:4fdd8a63-85d7-4621-92a1-c8287782b40b> .
<urn:uuid:4fdd8a63-85d7-4621-92a1-c8287782b40b> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/prov#Activity> <urn:uuid:4fdd8a63-85d7-4621-92a1-c8287782b40b> .
<urn:uuid:4fdd8a63-85d7-4621-92a1-c8287782b40b> <http://purl.org/dc/terms/description> "A crawl event that discovers biodiversity archives."@en <urn:uuid:4fdd8a63-85d7-4621-92a1-c8287782b40b> .
<urn:uuid:4fdd8a63-85d7-4621-92a1-c8287782b40b> <http://www.w3.org/ns/prov#startedAtTime> "2024-05-15T18:17:46.114Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> <urn:uuid:4fdd8a63-85d7-4621-92a1-c8287782b40b> .
<urn:uuid:4fdd8a63-85d7-4621-92a1-c8287782b40b> <http://www.w3.org/ns/prov#wasStartedBy> <https://preston.guoda.bio> <urn:uuid:4fdd8a63-85d7-4621-92a1-c8287782b40b> .
<https://doi.org/10.5281/zenodo.1410543> <http://www.w3.org/ns/prov#usedBy> <urn:uuid:4fdd8a63-85d7-4621-92a1-c8287782b40b> <urn:uuid:4fdd8a63-85d7-4621-92a1-c8287782b40b> .
<https://doi.org/10.5281/zenodo.1410543> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/dcmitype/Software> <urn:uuid:4fdd8a63-85d7-4621-92a1-c8287782b40b> .
<https://doi.org/10.5281/zenodo.1410543> <http://purl.org/dc/terms/bibliographicCitation> "Jorrit Poelen, Icaro Alzuru, & Michael Elliott. 2021. Preston: a biodiversity dataset tracker (Version 0.8.6-SNAPSHOT) [Software]. Zenodo. https://doi.org/10.5281/zenodo.1410543"@en <urn:uuid:4fdd8a63-85d7-4621-92a1-c8287782b40b> .
<urn:uuid:0659a54f-b713-4f86-a917-5be166a14110> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/prov#Entity> <urn:uuid:4fdd8a63-85d7-4621-92a1-c8287782b40b> .
<urn:uuid:0659a54f-b713-4f86-a917-5be166a14110> <http://purl.org/dc/terms/description> "A biodiversity dataset graph archive."@en <urn:uuid:4fdd8a63-85d7-4621-92a1-c8287782b40b> .
<hash://sha256/8c09752b2ac784fd52e3bb5b4cfa33635d1e677fb55667114f414447f26efa5e> <http://www.w3.org/ns/prov#wasGeneratedBy> <urn:uuid:c3293b70-1a26-4762-bfa3-1757f248a000> <urn:uuid:c3293b70-1a26-4762-bfa3-1757f248a000> .
<hash://sha256/8c09752b2ac784fd52e3bb5b4cfa33635d1e677fb55667114f414447f26efa5e> <http://www.w3.org/ns/prov#qualifiedGeneration> <urn:uuid:c3293b70-1a26-4762-bfa3-1757f248a000> <urn:uuid:c3293b70-1a26-4762-bfa3-1757f248a000> .
<urn:uuid:c3293b70-1a26-4762-bfa3-1757f248a000> <http://www.w3.org/ns/prov#generatedAtTime> "2024-05-15T18:17:48.444Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> <urn:uuid:c3293b70-1a26-4762-bfa3-1757f248a000> .
<urn:uuid:c3293b70-1a26-4762-bfa3-1757f248a000> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/prov#Generation> <urn:uuid:c3293b70-1a26-4762-bfa3-1757f248a000> .
<urn:uuid:c3293b70-1a26-4762-bfa3-1757f248a000> <http://www.w3.org/ns/prov#wasInformedBy> <urn:uuid:4fdd8a63-85d7-4621-92a1-c8287782b40b> <urn:uuid:c3293b70-1a26-4762-bfa3-1757f248a000> .
<urn:uuid:c3293b70-1a26-4762-bfa3-1757f248a000> <http://www.w3.org/ns/prov#used> <https://www.zotero.org/groups/5532807> <urn:uuid:c3293b70-1a26-4762-bfa3-1757f248a000> .
<https://www.zotero.org/groups/5532807> <http://purl.org/pav/hasVersion> <hash://sha256/8c09752b2ac784fd52e3bb5b4cfa33635d1e677fb55667114f414447f26efa5e> <urn:uuid:c3293b70-1a26-4762-bfa3-1757f248a000> .
<https://api.zotero.org/groups/5532807> <http://www.w3.org/ns/prov#alternateOf> <https://www.zotero.org/groups/5532807> .
<hash://sha256/a40632c1819e330ebbde8ece5df9abe5448eb218fee150cb6bd96caaa8ccbfa4> <http://www.w3.org/ns/prov#wasGeneratedBy> <urn:uuid:4e8260d8-aa32-4a55-9755-4f377b16b2cf> <urn:uuid:4e8260d8-aa32-4a55-9755-4f377b16b2cf> .
<hash://sha256/a40632c1819e330ebbde8ece5df9abe5448eb218fee150cb6bd96caaa8ccbfa4> <http://www.w3.org/ns/prov#qualifiedGeneration> <urn:uuid:4e8260d8-aa32-4a55-9755-4f377b16b2cf> <urn:uuid:4e8260d8-aa32-4a55-9755-4f377b16b2cf> .
<urn:uuid:4e8260d8-aa32-4a55-9755-4f377b16b2cf> <http://www.w3.org/ns/prov#generatedAtTime> "2024-05-15T18:17:50.075Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> <urn:uuid:4e8260d8-aa32-4a55-9755-4f377b16b2cf> .
<urn:uuid:4e8260d8-aa32-4a55-9755-4f377b16b2cf> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/prov#Generation> <urn:uuid:4e8260d8-aa32-4a55-9755-4f377b16b2cf> .
<urn:uuid:4e8260d8-aa32-4a55-9755-4f377b16b2cf> <http://www.w3.org/ns/prov#used> <https://api.zotero.org/groups/5532807> <urn:uuid:4e8260d8-aa32-4a55-9755-4f377b16b2cf> .
<https://api.zotero.org/groups/5532807> <http://purl.org/pav/hasVersion> <hash://sha256/a40632c1819e330ebbde8ece5df9abe5448eb218fee150cb6bd96caaa8ccbfa4> <urn:uuid:4e8260d8-aa32-4a55-9755-4f377b16b2cf> .
<hash://sha256/a54fbd3bc1eba272cdba5ba4f4c121c1ee45eee62252f8b1a7afda75c1545c7c> <http://www.w3.org/ns/prov#wasGeneratedBy> <urn:uuid:49ddb283-a1b0-4c3e-b227-8b658352b76e> <urn:uuid:49ddb283-a1b0-4c3e-b227-8b658352b76e> .
<hash://sha256/a54fbd3bc1eba272cdba5ba4f4c121c1ee45eee62252f8b1a7afda75c1545c7c> <http://www.w3.org/ns/prov#qualifiedGeneration> <urn:uuid:49ddb283-a1b0-4c3e-b227-8b658352b76e> <urn:uuid:49ddb283-a1b0-4c3e-b227-8b658352b76e> .
<urn:uuid:49ddb283-a1b0-4c3e-b227-8b658352b76e> <http://www.w3.org/ns/prov#generatedAtTime> "2024-05-15T18:17:51.612Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> <urn:uuid:49ddb283-a1b0-4c3e-b227-8b658352b76e> .
<urn:uuid:49ddb283-a1b0-4c3e-b227-8b658352b76e> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/prov#Generation> <urn:uuid:49ddb283-a1b0-4c3e-b227-8b658352b76e> .
<urn:uuid:49ddb283-a1b0-4c3e-b227-8b658352b76e> <http://www.w3.org/ns/prov#used> <https://api.zotero.org/groups/5532807/items?start=0&limit=100> <urn:uuid:49ddb283-a1b0-4c3e-b227-8b658352b76e> .
<https://api.zotero.org/groups/5532807/items?start=0&limit=100> <http://purl.org/pav/hasVersion> <hash://sha256/a54fbd3bc1eba272cdba5ba4f4c121c1ee45eee62252f8b1a7afda75c1545c7c> <urn:uuid:49ddb283-a1b0-4c3e-b227-8b658352b76e> .
<cut:hash://sha256/a54fbd3bc1eba272cdba5ba4f4c121c1ee45eee62252f8b1a7afda75c1545c7c!/b7-2980> <http://purl.org/dc/elements/1.1/format> "application/json+zotero" .
<hash://sha256/a54fbd3bc1eba272cdba5ba4f4c121c1ee45eee62252f8b1a7afda75c1545c7c> <http://www.w3.org/ns/prov#hadMember> <cut:hash://sha256/a54fbd3bc1eba272cdba5ba4f4c121c1ee45eee62252f8b1a7afda75c1545c7c!/b7-2980> .
<cut:hash://sha256/a54fbd3bc1eba272cdba5ba4f4c121c1ee45eee62252f8b1a7afda75c1545c7c!/b7-2980> <http://purl.org/pav/hasVersion> <cut:hash://sha256/a54fbd3bc1eba272cdba5ba4f4c121c1ee45eee62252f8b1a7afda75c1545c7c!/b7-2980> .
<cut:hash://sha256/a54fbd3bc1eba272cdba5ba4f4c121c1ee45eee62252f8b1a7afda75c1545c7c!/b2987-7188> <http://purl.org/dc/elements/1.1/format> "application/json+zotero" .
<hash://sha256/a54fbd3bc1eba272cdba5ba4f4c121c1ee45eee62252f8b1a7afda75c1545c7c> <http://www.w3.org/ns/prov#hadMember> <cut:hash://sha256/a54fbd3bc1eba272cdba5ba4f4c121c1ee45eee62252f8b1a7afda75c1545c7c!/b2987-7188> .
<cut:hash://sha256/a54fbd3bc1eba272cdba5ba4f4c121c1ee45eee62252f8b1a7afda75c1545c7c!/b2987-7188> <http://purl.org/pav/hasVersion> <cut:hash://sha256/a54fbd3bc1eba272cdba5ba4f4c121c1ee45eee62252f8b1a7afda75c1545c7c!/b2987-7188> .

Note that the group contains two records, referenced by

cut:hash://sha256/a54fbd3bc1eba272cdba5ba4f4c121c1ee45eee62252f8b1a7afda75c1545c7c!/b7-2980

and

cut:hash://sha256/a54fbd3bc1eba272cdba5ba4f4c121c1ee45eee62252f8b1a7afda75c1545c7c!/b2987-7188
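
To make the reference notation concrete, here is a minimal sketch of how such a `cut:` reference could be taken apart and applied to content. It is not Preston's code: `parse_cut` and `resolve_cut` are hypothetical helpers, and reading `b7-2980` as a 1-based, inclusive byte range is an assumption (it is consistent with the first record starting after `[`, a newline, and four spaces, and with the second record starting six bytes after the first one ends).

```python
import hashlib
import re

# Assumed shape: cut:<content id>!/b<first byte>-<last byte>
CUT_PATTERN = re.compile(r"^cut:(hash://sha256/[0-9a-f]{64})!/b(\d+)-(\d+)$")

def parse_cut(uri: str) -> tuple[str, int, int]:
    """Split a cut reference into (content id, first byte, last byte)."""
    match = CUT_PATTERN.match(uri)
    if match is None:
        raise ValueError(f"not a recognized cut reference: {uri}")
    return match.group(1), int(match.group(2)), int(match.group(3))

def resolve_cut(uri: str, content: bytes) -> bytes:
    """Return the referenced bytes after checking the content hash."""
    content_id, first, last = parse_cut(uri)
    actual_id = "hash://sha256/" + hashlib.sha256(content).hexdigest()
    if actual_id != content_id:
        raise ValueError("content does not match the referenced hash")
    # Assumption: offsets are 1-based and inclusive.
    return content[first - 1:last]

# Hypothetical stand-in for a tracked Zotero page.
content = b'[\n    {"key": "A"},\n    {"key": "B"}\n]'
content_id = "hash://sha256/" + hashlib.sha256(content).hexdigest()
first_record = resolve_cut(f"cut:{content_id}!/b7-18", content)
print(first_record.decode("utf-8"))
```

Because the reference embeds the content hash, the slice can be verified against the exact version of the page it was cut from.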
jhpoelen commented 1 month ago

With

preston cat 'cut:hash://sha256/a54fbd3bc1eba272cdba5ba4f4c121c1ee45eee62252f8b1a7afda75c1545c7c!/b7-2980'

producing:

{
        "key": "STX9TZD9",
        "version": 5,
        "library": {
            "type": "group",
            "id": 5532807,
            "name": "preston",
            "links": {
                "alternate": {
                    "href": "https://www.zotero.org/groups/preston",
                    "type": "text/html"
                }
            }
        },
        "links": {
            "self": {
                "href": "https://api.zotero.org/groups/5532807/items/STX9TZD9",
                "type": "application/json"
            },
            "alternate": {
                "href": "https://www.zotero.org/groups/preston/items/STX9TZD9",
                "type": "text/html"
            }
        },
        "meta": {
            "createdByUser": {
                "id": 4163804,
                "username": "jhpoelen",
                "name": "",
                "links": {
                    "alternate": {
                        "href": "https://www.zotero.org/jhpoelen",
                        "type": "text/html"
                    }
                }
            },
            "creatorSummary": "Elliott et al.",
            "parsedDate": "2020",
            "numChildren": 0
        },
        "data": {
            "key": "STX9TZD9",
            "version": 5,
            "itemType": "journalArticle",
            "title": "Toward reliable biodiversity dataset references",
            "creators": [
                {
                    "creatorType": "author",
                    "firstName": "Michael J.",
                    "lastName": "Elliott"
                },
                {
                    "creatorType": "author",
                    "firstName": "Jorrit H.",
                    "lastName": "Poelen"
                },
                {
                    "creatorType": "author",
                    "firstName": "José A.B.",
                    "lastName": "Fortes"
                }
            ],
            "abstractNote": "",
            "publicationTitle": "Ecological Informatics",
            "volume": "59",
            "issue": "",
            "pages": "101132",
            "date": "09/2020",
            "series": "",
            "seriesTitle": "",
            "seriesText": "",
            "journalAbbreviation": "Ecological Informatics",
            "language": "en",
            "DOI": "10.1016/j.ecoinf.2020.101132",
            "ISSN": "15749541",
            "shortTitle": "",
            "url": "https://linkinghub.elsevier.com/retrieve/pii/S1574954120300820",
            "accessDate": "2024-05-15T18:10:32Z",
            "archive": "",
            "archiveLocation": "",
            "libraryCatalog": "DOI.org (Crossref)",
            "callNumber": "",
            "rights": "",
            "extra": "",
            "tags": [],
            "collections": [],
            "relations": {},
            "dateAdded": "2024-05-15T18:10:33Z",
            "dateModified": "2024-05-15T18:10:33Z"
        }
    }
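
As an illustration of the subsequent processing this enables, the sketch below parses a single record and builds a short citation line. The `summarize` helper is hypothetical, and the sample is an abbreviated copy of the record above containing only the fields the sketch reads.

```python
import json

# Abbreviated copy of the record above (only the fields used here).
record_json = """
{
    "meta": {"creatorSummary": "Elliott et al.", "parsedDate": "2020"},
    "data": {
        "title": "Toward reliable biodiversity dataset references",
        "DOI": "10.1016/j.ecoinf.2020.101132"
    }
}
"""

def summarize(record: dict) -> str:
    """Build a one-line citation from a single Zotero item record."""
    meta = record.get("meta", {})
    data = record.get("data", {})
    doi = data.get("DOI", "")
    parts = [
        f'{meta.get("creatorSummary", "Unknown")} ({meta.get("parsedDate", "n.d.")}).',
        f'{data.get("title", "")}.',
    ]
    if doi:
        # Only append a DOI link when the record carries one.
        parts.append(f"https://doi.org/{doi}")
    return " ".join(parts)

summary = summarize(json.loads(record_json))
print(summary)
```

Since each record is a standalone JSON object, no array handling or pagination logic is needed at this stage.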
jhpoelen commented 1 month ago

The ability to point to individual Zotero records has been made available in https://github.com/bio-guoda/preston/releases/tag/0.8.5 .