Open eroux opened 5 years ago
I am not aware of any such project.
This seems like a fairly well-scoped task we could help with. Can you give some indication of the best way to do so? I think the easiest approach would be a script that reads the data in _data and fetches the labels in the various languages. It would then produce an RDF file that could live on GitHub. Does that sound about right?
Depends on what data you need. The files in _data are updated irregularly, and they say which language-label pairs have documentation pages, which is not the same as being a valid label that can be used in a corpus. At least not yet.
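To make the script idea concrete, here is a minimal sketch of turning language-label pairs into a Turtle file. Everything specific in it is an assumption: the input is taken to be already parsed into `(language, label)` tuples (the real layout of the files in _data is not described in this thread), and both the `http://example.org/ud2/` namespace and the `documentedIn` property are hypothetical placeholders. Per the caveat above, the output only records which labels have documentation pages, not which labels are valid in a corpus.

```python
from collections import defaultdict
from urllib.parse import quote

# Hypothetical namespace: no official UD v2 RDF namespace is assumed here.
BASE = "http://example.org/ud2/"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
# Hypothetical property meaning "has a documentation page for this language".
DOCUMENTED_IN = BASE + "documentedIn"


def pairs_to_turtle(pairs):
    """Render (language, label) pairs as Turtle, one resource per label."""
    langs_by_label = defaultdict(set)
    for lang, label in pairs:
        langs_by_label[label].add(lang)

    blocks = []
    for label in sorted(langs_by_label):
        # Percent-encode the label so subtypes such as "nsubj:pass" stay valid in an IRI.
        subject = f"<{BASE}{quote(label, safe='')}>"
        langs = ", ".join(f'"{lang}"' for lang in sorted(langs_by_label[label]))
        blocks.append(
            f'{subject} <{RDFS_LABEL}> "{label}" ;\n'
            f"    <{DOCUMENTED_IN}> {langs} ."
        )
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    sample = [("en", "NOUN"), ("fr", "NOUN"), ("en", "nsubj:pass")]
    print(pairs_to_turtle(sample))
```

Full IRIs are written out instead of prefixed names so that labels containing a colon need no special escaping.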
I see, thanks!
Here's another file containing UD rdf:
http://www.acoli.informatik.uni-frankfurt.de/resources/olia/ud-pos-link.rdf
I'll contact the authors
Yes, Christian Chiarcos from Frankfurt is also the main person behind the UD V1 mapping you linked from your first post.
Hi,
apologies for seeing this too late.
For UD data, you can use our CoNLL-to-RDF roundtripping with CoNLL-RDF. This includes support for CoNLL-U, and since December 2019, also CoNLL-U+: https://github.com/acoli-repo/conll-rdf
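For readers unfamiliar with the idea, the following is a rough sketch of what mapping one CoNLL-U token line to RDF triples can look like. It is not CoNLL-RDF's code or output format: the `http://example.org/conllu#` namespace, the `token_triples` function, and the choice of columns are all hypothetical. The only thing taken as given is the standard ten-column CoNLL-U layout.

```python
# The ten tab-separated columns of a CoNLL-U token line.
CONLLU_FIELDS = [
    "ID", "FORM", "LEMMA", "UPOS", "XPOS",
    "FEATS", "HEAD", "DEPREL", "DEPS", "MISC",
]

# Hypothetical namespace, used for illustration only.
EX = "http://example.org/conllu#"


def token_triples(line, sent_id="s1"):
    """Turn one CoNLL-U token line into a list of N-Triples strings.

    Multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1")
    are not handled in this sketch.
    """
    cols = line.rstrip("\n").split("\t")
    if len(cols) != len(CONLLU_FIELDS):
        raise ValueError(f"expected 10 columns, got {len(cols)}")
    tok = dict(zip(CONLLU_FIELDS, cols))

    subject = f"<{EX}{sent_id}_{tok['ID']}>"
    triples = []
    for field in ("FORM", "LEMMA", "UPOS", "DEPREL"):
        value = tok[field]
        if value == "_":  # "_" marks an unspecified value in CoNLL-U
            continue
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        triples.append(f'{subject} <{EX}{field}> "{escaped}" .')

    # Link the token to its syntactic head; "0" marks the root.
    if tok["HEAD"] not in ("_", "0"):
        head = f"<{EX}{sent_id}_{tok['HEAD']}>"
        triples.append(f"{subject} <{EX}HEAD> {head} .")
    return triples


if __name__ == "__main__":
    example = "1\tDogs\tdog\tNOUN\tNNS\tNumber=Plur\t2\tnsubj\t_\t_"
    for triple in token_triples(example):
        print(triple)
```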
UD tagsets are covered by OLiA (UD v.1 so far; UD v.2 is still experimental but should be stable before May): https://github.com/acoli-repo/olia/tree/master/owl/stable/ud*; the upcoming revision (no linking models yet) is under https://github.com/acoli-repo/olia/tree/master/owl/experimental/univ_dep/built-from-html.
An earlier version of the UD ontologies was generated directly (on the fly!) from the documentation Markdown, but the file structure changed with UD v.2. A prototype (never merged with the main branch) is still available under http://fginter.github.io/docs/; click the RDF buttons.
One of my former students (Guilherme Passos) has also worked on transforming UD to RDF. The idea is to help identify inconsistent annotations.
We have formalized UD guidelines 2.0 by hand and we implemented our own transformation.
Dissertation available at https://www.cos.ufrj.br/uploadfile/publicacao/2858.pdf
Maybe @GPPassos can add something here too
Thanks a lot, that's very helpful! I've looked at the dissertation but didn't find a link to the .owl file that's reproduced at the end of the PDF. I tried http://www.semanticweb.org/gppassos/ unsuccessfully. Do you know if it's available somewhere?
To answer some of the questions: I'm not currently using this in production. I'm exploring the idea of converting a custom annotation format that uses UD (it's a very simple system based on character coordinates) into proper web annotations.
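As an illustration of that conversion, here is a minimal sketch that wraps one character-offset annotation in a W3C Web Annotation using a `TextPositionSelector`. The assumptions are that the custom format boils down to a start offset, an end offset, and a UD tag, and that each tag has a URI; the `http://example.org/...` URIs and the `to_web_annotation` helper are hypothetical.

```python
import json


def to_web_annotation(source_iri, start, end, tag_iri):
    """Build a Web Annotation (as a dict) tagging a character span with a UD tag URI.

    `start` and `end` are zero-based character offsets into the source text,
    with `end` exclusive, as TextPositionSelector expects.
    """
    if start < 0 or end < start:
        raise ValueError("offsets must satisfy 0 <= start <= end")
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "motivation": "tagging",
        # A tag identified by an IRI is expressed as a SpecificResource.
        "body": {
            "type": "SpecificResource",
            "purpose": "tagging",
            "source": tag_iri,
        },
        "target": {
            "source": source_iri,
            "selector": {
                "type": "TextPositionSelector",
                "start": start,
                "end": end,
            },
        },
    }


if __name__ == "__main__":
    annotation = to_web_annotation(
        "http://example.org/texts/1",   # hypothetical source document
        0,
        4,
        "http://example.org/ud2/NOUN",  # hypothetical UD tag URI
    )
    print(json.dumps(annotation, indent=2))
```

With stable URIs for the UD tags, only the `tag_iri` values would need to change.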
I want to import a UD-tagged corpus using web annotations. I could define the UD URIs we need, but I'm wondering whether there is a project to produce an RDF export of UD v2. There is an RDF version of V1 here (if you click on the small logos on the right). I'm quite used to RDF so I could help, but I don't have much time or many resources at the moment.