Closed nleanba closed 4 months ago
manually checked all treatments until 006E64249F0BFFC1A6E1FAC0364EF948 (going alphabetically by ID)
they are all fine*
A notable change is that it changes a bunch of `treatsTaxonName` to `definesTaxonName`; as far as I can tell, this seems correct.
some changes that I consider irrelevant or even improvements:
Weird change in 0129163AB112602AFCCAFBD9FE4DF90D: the publication URI is changed to the one provided as `docSource`, which seems not to actually point to the article itself (unlike the URI generated by the XSLT, which is from `ID-DOI`). Is this a bug in the XML or in the XSLT replacement?
Checked until 016512208D171293F4B2C381B36B1F22
All fine notwithstanding changes mentioned above
I think it can replace the xslt like this, fixing potential other errors when they are noticed.
@retog opinions?
Why is `outputProperties` only used in `makePublication`?
`outputProperties` was the first way I've done it; everything else has been updated to use `Subject`, but I didn't think changing it for `makePublication` would be worth it.
I've added `fish` to the container, but I still can't execute the script:
root@fbc1a93c34b7:/workspaces/gg2rdf# ./test_noxslt.fish ./ex.xml
File ./ex.ttl doesn't exist! Aborting
root@fbc1a93c34b7:/workspaces/gg2rdf# ls ex.xml
ex.xml
root@fbc1a93c34b7:/workspaces/gg2rdf# ./test_noxslt.fish `pwd`/ex.xml
File /workspaces/gg2rdf/ex.ttl doesn't exist! Aborting
> I didn't think changing it for `makePublication` would be worth it.
I tend to disagree. It makes the NOTES at the top of the file inaccurate, and even if that is fixed it still increases complexity. I think the additional time needed when making future changes far outweighs the tedious but relatively small amount of work to make it consistent now.
> I've added `fish` to the container but I still can't execute the script
> root@fbc1a93c34b7:/workspaces/gg2rdf# ./test_noxslt.fish ./ex.xml
> File ./ex.ttl doesn't exist! Aborting
> root@fbc1a93c34b7:/workspaces/gg2rdf# ls ex.xml
> ex.xml
> root@fbc1a93c34b7:/workspaces/gg2rdf# ./test_noxslt.fish `pwd`/ex.xml
> File /workspaces/gg2rdf/ex.ttl doesn't exist! Aborting
The error occurs because the script checks for `{$ttlReferenceDir}/ex.ttl`, which probably does not exist in the Docker container.
`./test_noxslt.fish` was never intended to be run in the container; it is only for manual testing. If you wish to use it in a container, you should already be running an interactive shell in the container, so install fish from there.
If you simply wish to check what output `gg2rdf.ts` produces, just call `deno run --allow-read --allow-write ./gg2rdf.ts -i <xml filename> -o <ttl filename>`.
>> I didn't think changing it for `makePublication` would be worth it.
>
> I tend to disagree. It makes the NOTES at the top of the file inaccurate, and even if that is fixed it still increases complexity. I think the additional time needed when making future changes far outweighs the tedious but relatively small amount of work to make it consistent now.
I have changed this now.
Weird that it didn't! I committed from the working dev container.
> On February 29, 2024 8:18:01 AM GMT+01:00, nleanba commented on this pull request, on Dockerfile (https://github.com/plazi/gg2rdf/pull/11#discussion_r1507113075):
> I have removed it again, so that the container actually builds again
Experimenting with 000040332F2853C295734E7BD4190F05. I see the current ttl version has 106 triples, the one generated by the new transformer has 118.
These are the additional triples:
<http://taxon-concept.plazi.org/id/Animalia/Saigona_Matsumura_1910> <http://plazi.org/vocab/treatment#hasTaxonName> <http://taxon-name.plazi.org/id/Animalia/Saigona> .
<http://taxon-concept.plazi.org/id/Animalia/Saigona_Matsumura_1910> <http://rs.tdwg.org/dwc/terms/class> "Insecta" .
<http://taxon-concept.plazi.org/id/Animalia/Saigona_Matsumura_1910> <http://rs.tdwg.org/dwc/terms/family> "Dictyopharidae" .
<http://taxon-concept.plazi.org/id/Animalia/Saigona_Matsumura_1910> <http://rs.tdwg.org/dwc/terms/genus> "Saigona" .
<http://taxon-concept.plazi.org/id/Animalia/Saigona_Matsumura_1910> <http://rs.tdwg.org/dwc/terms/kingdom> "Animalia" .
<http://taxon-concept.plazi.org/id/Animalia/Saigona_Matsumura_1910> <http://rs.tdwg.org/dwc/terms/order> "Hemiptera" .
<http://taxon-concept.plazi.org/id/Animalia/Saigona_Matsumura_1910> <http://rs.tdwg.org/dwc/terms/phylum> "Arthropoda" .
<http://taxon-concept.plazi.org/id/Animalia/Saigona_Matsumura_1910> <http://rs.tdwg.org/dwc/terms/rank> "genus" .
<http://taxon-concept.plazi.org/id/Animalia/Saigona_Matsumura_1910> <http://rs.tdwg.org/dwc/terms/scientificNameAuthorship> "Matsumura, 1910" .
<http://taxon-concept.plazi.org/id/Animalia/Saigona_Matsumura_1910> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://filteredpush.org/ontologies/oa/dwcFP#TaxonConcept> .
43a54
<http://taxon-concept.plazi.org/id/Animalia/Saigona_baiseensis_Zheng_2021> <http://rs.tdwg.org/dwc/terms/scientificNameAuthorship> "Zheng & Chen, 2021" .
100a112
<http://treatment.plazi.org/id/000040332F2853C295734E7BD4190F05> <http://purl.org/dc/elements/1.1/title> "Saigona baiseensis Zheng & Chen 2021, sp. nov." .
The additional triples look sound; the previously missing title matches the one on https://treatment.plazi.org/id/000040332F2853C295734E7BD4190F05
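For anyone wanting to repeat this kind of comparison (106 versus 118 triples, 12 additions), here is a minimal sketch of a line-based set difference over two N-Triples serialisations. It assumes both files were already converted from Turtle to N-Triples by some other tool and contain no blank nodes; `diffTriples`, `statements`, and the `example.org` data are hypothetical and not part of gg2rdf.

```typescript
// Hypothetical helper, not part of gg2rdf: compare two N-Triples documents
// line by line and report which statements were added or removed.
interface TripleDiff {
  added: string[];
  removed: string[];
}

// Split a document into trimmed, non-empty, non-comment, de-duplicated lines.
function statements(ntriples: string): string[] {
  const lines = ntriples
    .split(/\r?\n/)
    .map((l) => l.trim())
    .filter((l) => l !== "" && l.charAt(0) !== "#");
  // An RDF graph is a set of triples, so repeated lines count once.
  return lines.filter((l, i) => lines.indexOf(l) === i);
}

function diffTriples(oldNt: string, newNt: string): TripleDiff {
  const before = statements(oldNt);
  const after = statements(newNt);
  return {
    added: after.filter((l) => before.indexOf(l) === -1),
    removed: before.filter((l) => after.indexOf(l) === -1),
  };
}

// Illustrative input only; these are not the real treatment files.
const oldNt = '<http://example.org/s> <http://example.org/p> "a" .\n';
const newNt =
  '<http://example.org/s> <http://example.org/p> "a" .\n' +
  '<http://example.org/s> <http://example.org/title> "b" .\n';

const diff = diffTriples(oldNt, newNt);
console.log(`added: ${diff.added.length}, removed: ${diff.removed.length}`);
```

Comparing lines is only reliable when both files use the same serialisation for every term; with blank nodes or differing literal escapes a proper graph isomorphism check would be needed instead.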
I do see two error messages:
Error: Invalid Authority for <http://taxon-concept.plazi.org/id/Animalia/fulgoroidesINVALID>
Error: Invalid Authority for <http://taxon-concept.plazi.org/id/Animalia/fulgoroidesINVALID>
Regarding
> `output(...)` should not be assumed to run synchronous,
> and all data passed to it should still be valid under reordering of calls.

In Turtle the order plays a role with regard to base and prefix, so I think this requirement should be dropped.
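To illustrate the ordering concern, here is a small sketch that treats each `output(...)` call as one chunk of Turtle and reports prefixes that are used before their `@prefix` declaration has been emitted. The function name, the `dwc` prefix label, and the deliberately simplified regular expressions are assumptions for illustration; this is not how gg2rdf itself works, and it is not a real Turtle parser.

```typescript
// Hypothetical helper, not part of gg2rdf: given Turtle chunks in emission
// order, list the prefixes used before their @prefix declaration.
function prefixesUsedBeforeDeclaration(chunks: string[]): string[] {
  const declared: string[] = [];
  const offenders: string[] = [];
  const declaration = /^\s*@prefix\s+([A-Za-z][\w-]*):\s*<[^>]*>\s*\.\s*$/;
  for (const chunk of chunks) {
    const decl = declaration.exec(chunk);
    if (decl) {
      declared.push(decl[1]);
      continue;
    }
    // Drop IRIs and string literals so that "http:" inside <...> is not
    // mistaken for a prefixed name.
    const bare = chunk
      .replace(/<[^>]*>/g, " ")
      .replace(/"(?:[^"\\]|\\.)*"/g, " ");
    const prefixedName = /([A-Za-z][\w-]*):[A-Za-z_]/g;
    let m: RegExpExecArray | null;
    while ((m = prefixedName.exec(bare)) !== null) {
      if (declared.indexOf(m[1]) === -1 && offenders.indexOf(m[1]) === -1) {
        offenders.push(m[1]);
      }
    }
  }
  return offenders;
}

const prefixLine = "@prefix dwc: <http://rs.tdwg.org/dwc/terms/> .";
const tripleLine =
  '<http://taxon-concept.plazi.org/id/Animalia/Saigona_Matsumura_1910> dwc:genus "Saigona" .';

// Same two chunks, two emission orders.
const inOrder = prefixesUsedBeforeDeclaration([prefixLine, tripleLine]);
const reordered = prefixesUsedBeforeDeclaration([tripleLine, prefixLine]);
console.log(inOrder, reordered);
```

The second call flags `dwc` because the triple arrives before its declaration, which is the situation a reordering of `output(...)` calls could produce. Output that uses only full IRIs, as in N-Triples, would not have this dependency.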
> I do see two error messages:
> Error: Invalid Authority for <http://taxon-concept.plazi.org/id/Animalia/fulgoroidesINVALID>
> Error: Invalid Authority for <http://taxon-concept.plazi.org/id/Animalia/fulgoroidesINVALID>
Those errors indicate that it found mentions of these taxa, but it could not figure out their authorities to turn them into citations (augments or deprecates). The RDF output matches the behaviour of the XSLT, but with more insight into why it was generated the way it was.
@retog did you rebase this into main?
This is rather inelegant, as now all my commits are marked as unverified.
I think a merge commit would have been better.
I did. I didn't know about this implication.
Open todos: