Closed tanikina closed 2 months ago
Remaining warnings regarding TA-loops:
nodeset_id=18321: Detected loop nodes: {'543218', '543226', '543222'}
nodeset_id=18795: Detected loop nodes: {'566024'}
nodeset_id=18874: Detected loop nodes: {'571641', '571658', '571679', '571673', '571714', '571683', '571653', '571646', '571688', '571663', '571668', '571636', '571631', '571629'}
nodeset_id=18877: Detected loop nodes: {'571917', '571930', '571923'}
nodeset_id=19173: Detected loop nodes: {'595568', '595573'}
nodeset_id=19174: Detected loop nodes: {'595636', '595696', '595648', '595640'}
nodeset_id=19773: Detected loop nodes: {'633816'}
nodeset_id=19897: Detected loop nodes: {'641279', '641301', '641289', '641269', '641257', '641317', '641250'}
nodeset_id=19918: Detected loop nodes: {'642982', '642991', '642987'}
nodeset_id=20729: Detected loop nodes: {'674593', '674586', '674582'}
nodeset_id=20894: Detected loop nodes: {'685146'}
nodeset_id=21022: Detected loop nodes: {'690585'}
nodeset_id=21023: Detected loop nodes: {'692003', '691998', '691979', '691976', '691991'}
nodeset_id=21039: Detected loop nodes: {'693222'}
nodeset_id=21275: Detected loop nodes: {'704348'}
nodeset_id=21279: Detected loop nodes: {'704689'}
nodeset_id=23120: Detected loop nodes: {'665292', '665296'}
nodeset_id=23144: Detected loop nodes: {'693005', '693036', '693021', '692990', '692968', '692947'}
nodeset_id=23391: Detected loop nodes: {'839308'}
nodeset_id=23479: Detected loop nodes: {'794844'}
nodeset_id=23517: Detected loop nodes: {'775891', '776007', '798681'}
nodeset_id=23533: Detected loop nodes: {'799381'}
nodeset_id=23551: Detected loop nodes: {'801243'}
nodeset_id=23552: Detected loop nodes: {'767719', '801330', '801302'}
nodeset_id=23560: Detected loop nodes: {'802482', '802475'}
nodeset_id=23599: Detected loop nodes: {'806770'}
nodeset_id=23688: Detected loop nodes: {'860200'}
nodeset_id=23696: Detected loop nodes: {'817050'}
nodeset_id=23789: Detected loop nodes: {'820709'}
nodeset_id=23799: Detected loop nodes: {'823663', '823659'}
nodeset_id=23809: Detected loop nodes: {'824652'}
nodeset_id=23837: Detected loop nodes: {'827426'}
nodeset_id=23849: Detected loop nodes: {'828443'}
nodeset_id=23853: Detected loop nodes: {'828982'}
nodeset_id=23878: Detected loop nodes: {'831523'}
nodeset_id=23892: Detected loop nodes: {'832595', '832589'}
nodeset_id=23959: Detected loop nodes: {'840574', '840569', '840579'}
nodeset_id=25511: Detected loop nodes: {'605130', '605135'}
nodeset_id=25526: Detected loop nodes: {'601506'}
nodeset_id=25528: Detected loop nodes: {'600819'}
nodeset_id=25691: Detected loop nodes: {'1027474', '1027462', '1027457', '775927', '775933', '1027467'}
nodeset_id=25723: Detected loop nodes: {'816292'}
This adds a check whether any pair of nodes has several paths connecting them (i.e., A -> ... -> A loops). For example, see the following loop with multiple L-nodes in nodeset 18321:
Because of such loops, the DFS never adds these nodes to the stack (because of the check for unvisited children here) and, as a result, they are missing from the final nodeset after calling `sort_nodes_by_hierarchy()`. This version collects all such nodes and adds them to the stack here, so that after we process all the leaves we can also process such cases.

Output before fix:
EDIT: we get 34 failed nodesets if we don't check for any loops at all. If we check for self-loops, like here, then we get 19 failed nodesets (and 1381 processed nodesets in total).
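
To illustrate the idea, here is a minimal sketch of loop detection and of a leaf-first ordering that falls back to appending the nodes blocked by loops. It is not the actual `sort_nodes_by_hierarchy()` code: the function names `find_loop_nodes` and `order_leaves_first`, the edge-list input format, and the simple repeated-pass strategy (instead of the real DFS with a stack) are all assumptions made for this example.

```python
from collections import defaultdict


def find_loop_nodes(edges):
    """Return all nodes that lie on a directed loop (A -> ... -> A).

    `edges` is a list of (source, target) pairs (hypothetical input format).
    """
    children = defaultdict(set)
    for src, dst in edges:
        children[src].add(dst)

    def reaches_itself(start):
        # Walk everything reachable from `start` and see if we come back.
        stack = list(children[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node == start:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(children[node])
        return False

    nodes = {n for edge in edges for n in edge}
    return {n for n in nodes if reaches_itself(n)}


def order_leaves_first(edges):
    """Order nodes so that children come before their parents.

    A node is emitted only once all of its children have been emitted.
    Nodes on a loop (and their ancestors) never satisfy that condition,
    so they are collected afterwards and appended instead of being dropped.
    Returns (ordered_nodes, leftover_nodes).
    """
    children = defaultdict(set)
    nodes = []
    for src, dst in edges:
        children[src].add(dst)
        for n in (src, dst):
            if n not in nodes:
                nodes.append(n)

    ordered, done = [], set()
    progress = True
    while progress:
        progress = False
        for node in nodes:
            if node not in done and children[node] <= done:
                ordered.append(node)
                done.add(node)
                progress = True

    # Without this step the loop nodes would be missing from the result.
    leftover = [n for n in nodes if n not in done]
    ordered.extend(leftover)
    return ordered, leftover


if __name__ == "__main__":
    # B and C form a loop; D is a leaf; A is a parent of the loop.
    example = [("A", "B"), ("B", "C"), ("C", "B"), ("B", "D")]
    print(sorted(find_loop_nodes(example)))
    print(order_leaves_first(example))
```

In this sketch a self-loop such as `("X", "X")` is also reported by `find_loop_nodes`, which corresponds to the single-node entries in the warnings listed above.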
Output after fix:
Now `visualize_arg_map.py` also displays the nodeset with loops correctly. Below is an example for nodeset 25511.

Visualization before fix:
![nodeset25511 gv](https://github.com/ArneBinder/dialam-2024-shared-task/assets/9082878/436475db-a954-458e-b5eb-8e0486e04cec)
$ python src/visualization/visualize_arg_map.py data/train/ data/visualizations 25511
Visualization after fix:
![nodeset25511 gv](https://github.com/ArneBinder/dialam-2024-shared-task/assets/9082878/27c9ed8b-c942-425f-8372-e93942e09a40)
Note that the size of the training set has changed from 1381 to 1400. The blacklist and the test in `tests/dataset_builders/pie/test_dialam2024.py` have been updated accordingly.