calzada closed this issue 3 years ago
It seems like the example is missing from your question. If the question was "how to create IDs of segments", I've done this now in 6d9d23e (should've done it before!). If the question was "how are the IDs of u elements created", well, they just go one after the other, e.g.
<u xml:id="ParlaMint-ES_2020-12-15-CD201215.u1" ...>
...
<u xml:id="ParlaMint-ES_2020-12-15-CD201215.u2" ...>
but that much is obvious by just looking at some file.
Please note that we are not quite finished with the corpus; there are still some things to improve. But I haven't found the time yet, as I have to look at the corpora for lots of other languages. Of course, they can already set up the annotation pipeline, and it should be simple to run it again once we have the base corpus ready.
Excellent, I will circulate your answer, and do not worry. We will wait as long as you need, since we realize you are extremely busy. I am so embarrassed to be giving you so much work that I am trying to lift some of the load. Please let me know if you need anything from us. Have a nice weekend, great Tomaz https://www.youtube.com/watch?v=92wf6LM8wh8
Best for now, mc
Dear Tomaz, another email from Luciana:
Hi Tomaž,
I actually meant the id that every seg has. For example,
I noticed that the script needs one parent ID to create the following word IDs so that we can display the dependency relations https://github.com/clarin-eric/ParlaMint/blob/a1110008eae5bc837d111bf46aa405671948fd13/ParlaMint-PL/ParlaMint-PL_2015-11-12-senat-01-1.ana.xml#L1779.
I'd like to know how this id="segXXXXXX" is created.
I intend to use an adaptation of https://github.com/clarin-eric/ParlaMint/blob/a1110008eae5bc837d111bf46aa405671948fd13/Scripts/classlisize.py to create the other child ids. For instance, , , and so on.
Thank you so much again, Luciana.
Best mc
I'd like to know how this id="segXXXXXX" is created.
Well, it doesn't really matter, as long as the seg id is unique in the corpus. But, for ES, I just appended .$n to the u ID: https://github.com/calzada/PARLAMINT-ES-MC/blob/6d9d23e8d1fc88cfc3e8db9065954c1b6e19e7cc/ParlaMint/ParlaMint-ES_2015-01-20-CD150120.xml#L101
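As a rough illustration of the "append .$n to the u ID" scheme described above, here is a minimal Python sketch. It is not the script actually used for the ES corpus: the function name `number_segments`, the sample fragment, and the choice to count from 1 are assumptions made for the example, and the sample omits the TEI namespace that real ParlaMint files declare.

```python
import xml.etree.ElementTree as ET

# Clark-notation name of the xml:id attribute as ElementTree exposes it.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def number_segments(root):
    """Give every <seg> inside a <u> the ID '<u-id>.<n>' (n counts from 1; an assumption)."""
    for u in root.iter("u"):
        u_id = u.get(XML_ID)
        for n, seg in enumerate(u.iter("seg"), start=1):
            seg.set(XML_ID, f"{u_id}.{n}")
    return root


# Hypothetical fragment without the TEI namespace, for illustration only.
sample = """<div>
  <u xml:id="ParlaMint-ES_2020-12-15-CD201215.u1"><seg>Hola.</seg><seg>Gracias.</seg></u>
  <u xml:id="ParlaMint-ES_2020-12-15-CD201215.u2"><seg>Bien.</seg></u>
</div>"""

root = number_segments(ET.fromstring(sample))
seg_ids = [seg.get(XML_ID) for seg in root.iter("seg")]
print(seg_ids)
```

Because each seg ID is built from its parent u ID, it stays unique across the corpus as long as the u IDs are unique, which is the only property the answer above requires.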
@calzada, if you have XML tags in the text of the issue, you need to put them in backticks (inverted apostrophes), like `<u>`; if you just write the tag without them, strange things happen, and I then don't see the examples. Have a look at the Markdown guide, Inline code.
OK, Tomaz. Noted. And thanks for your help.
Best for now, mc
And this has been settled as well.
Dear Tomaz, I have asked for help from two very talented NLP experts. Luciana asks me this:
"How are the seg id parents created (as in)? I've been trying to use an adapted version of classlisize.py for Spanish, and I noticed it requires a parent ID, are they created randomly, or are listed somewhere?
Sorry if this info is already presented in a documentation."
If you answer, whenever you can, I will redirect this information, or I can also give both of them the rights to work here. Best for now, mc