Open hoijui opened 5 months ago
@hoijui thanks! I listed your dump on Wikidata: https://www.wikidata.org/wiki/Q39392701#P4945 .
Please state your update policy. Your dump is 3 months old, but this query at https://lov.linkeddata.es/dataset/lov/sparql
prefix dct: <http://purl.org/dc/terms/>
select * { # (max(?upd) as ?updated) {
?x dct:modified ?upd
} order by desc(?upd) limit 20
shows newer entries.
Is that because the LOV dump itself is 3 months old, or because you don't track it regularly?
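For anyone who wants to automate this freshness check, here is a minimal Python sketch. It assumes the endpoint accepts standard SPARQL Protocol GET requests with a `query` parameter and can return SPARQL JSON results. The helper names (`build_query_url`, `latest_modified`, `fetch_latest_modified`) are made up for illustration. It uses the `max(?upd)` aggregate that is commented out in the query above.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://lov.linkeddata.es/dataset/lov/sparql"

# Aggregate variant of the query above: only the newest dct:modified value.
QUERY = """
PREFIX dct: <http://purl.org/dc/terms/>
SELECT (MAX(?upd) AS ?updated) { ?x dct:modified ?upd }
"""


def build_query_url(endpoint: str, query: str) -> str:
    """Build a SPARQL Protocol GET URL (assumption: the endpoint supports GET)."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def latest_modified(results: dict):
    """Extract ?updated from a SPARQL JSON result; None if nothing is bound."""
    bindings = results["results"]["bindings"]
    if not bindings or "updated" not in bindings[0]:
        return None
    return bindings[0]["updated"]["value"]


def fetch_latest_modified(endpoint: str = ENDPOINT):
    """Run the query over the network and return the newest modification date."""
    request = urllib.request.Request(
        build_query_url(endpoint, QUERY),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return latest_modified(json.load(response))
```

Comparing the value returned by `fetch_latest_modified()` with the date of the last commit in the dump repository would show whether the mirror lags behind the endpoint.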
Thank you for that. Indeed, I completely neglected it! I think that happened because I initially planned to do this with GitHub Actions, but moving to Codeberg made that more cumbersome, and it got lost. Of course, the dump is of little use without regular updates, so thank you! How would you do it? As Codeberg has limited resources, I think it would be good to use a scheduled (e.g. once a day) GitHub Action and push any changes over to Codeberg. It should be relatively straightforward, as long as I don't run into any size or access limitations...
It should now be updated daily (if there are changes) by this repo's CI: https://github.com/elevont/lov-dump-updater
... but ... it looks like there is an issue with the blank nodes. :/ On each data dump, they get assigned different (random) IDs, and this shows up in the diff, of course. So about 1/3 of all lines show up as changed. That is neither meaningful nor maintainable over time. Any idea how to solve this? The best way would be to have fixed IDs for blank nodes (as in, they don't change between data dumps). Are you from the LOV team, by any chance?
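One possible stopgap, until blank-node IDs are stable, is to check whether two dumps differ in anything other than blank-node labels before committing. The Python sketch below assumes a line-based serialization such as N-Triples or N-Quads. The function names are hypothetical. It is only a heuristic: masking the labels discards which statements share a blank node, so two structurally different graphs can still compare as equal. A rigorous comparison would need a proper canonicalization algorithm such as W3C RDF Dataset Canonicalization.

```python
import re
from collections import Counter

# Blank-node labels in N-Triples/N-Quads look like _:b0 or _:genid123.
# A label may contain dots but must not end with one.
# Caveat: this pattern would also match "_:x" inside a literal or IRI.
BNODE = re.compile(r"_:[A-Za-z0-9_](?:[A-Za-z0-9_.\-]*[A-Za-z0-9_\-])?")


def mask_bnodes(line: str) -> str:
    """Replace every blank-node label on the line with the placeholder _:b."""
    return BNODE.sub("_:b", line.strip())


def same_modulo_bnode_labels(old_lines, new_lines) -> bool:
    """True if both dumps hold the same statements once labels are masked.

    Line order is ignored; duplicate statements are counted.
    """
    old = Counter(mask_bnodes(line) for line in old_lines if line.strip())
    new = Counter(mask_bnodes(line) for line in new_lines if line.strip())
    return old == new
```

A daily job could skip the commit whenever `same_modulo_bnode_labels(old, new)` returns true, so that only dumps with real changes show up in the history.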
https://github.com/atextor/turtle-formatter/issues/8 : there is active development on this tool, and stability of blank nodes is one of the issues being addressed.
You use it as described at https://atextor.de/owl-cli/main/snapshot/usage.html#write-command
I'm not from the LOV team, if indeed there is one.
Available here: https://codeberg.org/elevont/lov-dump
I did this because, for software relying on this data, it comes in handy to be able to include it as a git submodule instead of having to download it before or during the build process. It prevents needless re-downloads, security policy bells ringing, and many other similar issues.
I did it on codeberg.org because both GitHub and GitLab.com have 100 MB blob size limits, and this is 208 MB.