nyu-mll / jiant-v1-legacy

The jiant toolkit for general-purpose text understanding models
MIT License

[CLOSED] spr&srl data #1090

jeswan closed this issue 4 years ago

jeswan commented 4 years ago

Issue by lovodkin93, Saturday May 16, 2020 at 19:36 GMT. Originally opened as https://github.com/nyu-mll/jiant/issues/1090


Hello, I am trying to analyze the SPR1 and SRL data using the scores produced by running analyze_runs.py. I'm particularly interested in the distance between span1 and span2. However, something in the scores doesn't make sense to me. From what I know about the SPR and SRL tasks, each example has a span1 and a span2, with the connection between them given by "label". But in the dataframe produced by the analyze_runs.py script there seems to be only one span_distance (with different distance values, but no distinction between span1 and span2).

Is this span_distance the distance between span1 and span2? If so, is it measured between the minimal (start) indices of the two spans, between their maximal (end) indices, or between something else? And if not, is there a way to add the distance between the two spans to the dataframe?

Thank you very much.

PS: I've attached a screenshot of the dataframe with the span_distance I was talking about (it is from the SPR analysis, but the SRL one looks the same).

[image: screenshot of the analysis dataframe showing the span_distance column]

jeswan commented 4 years ago

Comment by iftenney Tuesday May 19, 2020 at 18:13 GMT


If you're using analysis.py or something that calls it, span_distance is computed here: https://github.com/nyu-mll/jiant/blob/44c6780738be1eee9868d35a0f2f96f42ba71aa7/probing/analysis.py#L425

As implemented, it should be equal to the number of tokens between the two spans. For example, given [foo] bar [baz], span_distance would be 1.
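For illustration only, the "number of tokens between the two spans" definition can be sketched as below. This is not the code from analysis.py: the helper name `tokens_between` is hypothetical, and the sketch assumes each span is a (start, end) pair of token indices with an exclusive end. If the spans in your data use inclusive ends, the arithmetic needs adjusting.

```python
from typing import Tuple

# Assumed convention (not confirmed by the thread): a span is
# (start, end) in token indices, with `end` exclusive.
Span = Tuple[int, int]


def tokens_between(span1: Span, span2: Span) -> int:
    """Hypothetical helper: count tokens strictly between two spans.

    Returns 0 when the spans are adjacent or overlap.
    """
    # For disjoint spans this is the start of the later span minus the
    # end of the earlier one; a negative value means the spans overlap.
    gap = max(span1[0], span2[0]) - min(span1[1], span2[1])
    return max(0, gap)


# "[foo] bar [baz]": foo is token 0, bar is token 1, baz is token 2.
print(tokens_between((0, 1), (2, 3)))  # prints 1
```

Under this definition the result does not depend on which span is called span1, which may be why the dataframe carries a single span_distance value per example.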

jeswan commented 4 years ago

Comment by lovodkin93 Friday May 22, 2020 at 11:01 GMT


Great, thank you very much!