Closed ArneBinder closed 2 months ago
Thanks a lot for the update and the detailed instructions! I tested this code on the predictions from xlm-roberta-large
(based on this model, seed 1), and the code above generates correct-looking nodesets.
There were only two (and a half) issues:
1) `test_map2.json` could not be processed because of the following error:
`ValueError: Expected all roles of n-ary relation s_nodes:Default Rephrase to be prefixed with s_nodes:, got ya_s2ta_nodes:source.`
I found the following annotation in `nary_relations`: `{"arguments": [3935837313197648457, 488896451309244507], "roles": ["ya_s2ta_nodes:source", "ya_s2ta_nodes:target"], "label": "s_nodes:Default Rephrase", "score": 0.28063133358955383, "_id": 1239322064654169658}`
2) I also visualized the generated nodesets to see if they look fine and, in general, they do. However, in `test_map0.json` we have a YA-node (the one at the top) that does not connect a TA-node to any other node, and I am not sure whether it should be there.
3) We also have some `-rev` relations in the output (e.g., `test_map8.json`). I suppose we should re-reverse them?
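To make issues 1) and 3) concrete, here is a minimal sketch of what the prefix check and the re-reversal could look like. It is not the code from this PR: the function names, the `-rev` label suffix, and the flat `(label, source, target)` representation are assumptions made for illustration only.

```python
from typing import List, Tuple

REV_SUFFIX = "-rev"  # assumed suffix marking a reversed relation label


def check_role_prefixes(label: str, roles: List[str]) -> None:
    """Raise if any role does not share the layer prefix of the relation label."""
    # "s_nodes:Default Rephrase" -> layer prefix "s_nodes:"
    prefix = label.split(":", 1)[0] + ":"
    for role in roles:
        if not role.startswith(prefix):
            raise ValueError(
                f"Expected all roles of n-ary relation {label} "
                f"to be prefixed with {prefix}, got {role}."
            )


def re_reverse(label: str, source: str, target: str) -> Tuple[str, str, str]:
    """Undo a reversed relation: drop the suffix and swap the two arguments."""
    if label.endswith(REV_SUFFIX):
        return label[: -len(REV_SUFFIX)], target, source
    return label, source, target
```

With the annotation quoted in 1), `check_role_prefixes("s_nodes:Default Rephrase", ["ya_s2ta_nodes:source", "ya_s2ta_nodes:target"])` raises on the first role, because the label belongs to the `s_nodes` layer while the roles belong to `ya_s2ta_nodes`.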
Thanks for testing that out! Regarding:
2) This happens when `NONE` is predicted for an S-node, but not for its anchor YA-node. Fixed in https://github.com/ArneBinder/dialam-2024-shared-task/pull/31/commits/3852b51b7efbece95c37769dc5ce2bbb45077b1a
3) `-rev` labeled relations: this just needs to be done. EDIT: fixed in https://github.com/ArneBinder/dialam-2024-shared-task/pull/31/commits/054da1658aa88feadc2040e3facc14b0f6ef13a7

Since our output was approved by the organizers for the nodesets from sample_test, this is finally ready.
This PR implements the following methods:
- `unmerge_relations()`: convert `TextDocumentWithLabeledEntitiesAndNaryRelations` back to `SimplifiedDialAM2024Document`, i.e. the inverse of `merge_relations()`
- `convert_to_example()`: convert `SimplifiedDialAM2024Document` to the shared task data format, optionally considering predicted relations

The predicted output can be loaded, the documents can then be converted, and the result can be saved to file.
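As a rough sketch of the load, convert, and save flow, the following uses only the standard library. It assumes the predictions are stored as JSON lines with an `id` field and a `metadata` dict; both are assumptions, and `convert_stub` is a hypothetical stand-in for `unmerge_relations()` followed by `convert_to_example()`, whose real signatures are not reproduced here.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterable, List


def load_predictions(path: Path) -> List[Dict[str, Any]]:
    """Read one serialized document per line (JSON lines format is an assumption)."""
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def convert_stub(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical stand-in for unmerge_relations() + convert_to_example().

    It only copies the original metadata through; the real conversion would
    also take the predicted relations into account.
    """
    meta = doc.get("metadata", {})
    return {key: meta.get(key, []) for key in ("nodes", "edges", "locations")}


def save_nodesets(docs: Iterable[Dict[str, Any]], out_dir: Path) -> List[Path]:
    """Write one JSON file per document, named after its (assumed) id field."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for doc in docs:
        target = out_dir / f"{doc['id']}.json"
        target.write_text(json.dumps(convert_stub(doc), indent=2), encoding="utf-8")
        written.append(target)
    return written
```

The stub makes the dependency on the original metadata visible: if a document was serialized without it, the converted file comes out with empty lists.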
Unfortunately, this requires some more metadata (original `nodes`, `edges`, and `locations`) which was previously not correctly added to the document, so it is necessary to create the predictions with this PR branch to get the conversion working correctly.

~NOTE: THIS IS NOT YET FULLY TESTED!~