timjogorman closed this issue 8 years ago
I'd imagine that we'd want to keep the NEs mentioned in "the Findlaw page", "from Wikipedia", "CNN" etc. I'd personally want this kind of mention to look like:
(p2 / page :poss (c2 / company :name (n / name :op1 "Findlaw")) :mod (t2 / this) :location (u / url-entity :value "http://caselaw.findlaw.com/us-supreme-court/307/174.html"))
I think I'd prefer :source instead of :poss for the name of the site/content provider.
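To make the :poss versus :source choice concrete, here is a minimal Python sketch that renders the Findlaw-style annotation with either role and checks that the parentheses balance. The function names `site_page_amr` and `is_balanced` are hypothetical helpers for illustration, not part of any AMR tool, and the structure simply mirrors the example quoted above.

```python
def site_page_amr(site: str, url: str, role: str = ":source") -> str:
    """Render a PENMAN string for a mention like "this Findlaw page".

    `role` attaches the site/content provider to the page node:
    ":poss" as in the original proposal, ":source" as suggested here.
    """
    if role not in (":poss", ":source"):
        raise ValueError(f"unexpected role: {role}")
    return (
        f'(p2 / page {role} (c2 / company :name (n / name :op1 "{site}")) '
        f':mod (t2 / this) '
        f':location (u / url-entity :value "{url}"))'
    )


def is_balanced(amr: str) -> bool:
    """Check parenthesis balance, ignoring parentheses inside quoted strings."""
    depth = 0
    in_string = False
    for ch in amr:
        if ch == '"':
            in_string = not in_string
        elif not in_string:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth < 0:
                    return False
    return depth == 0 and not in_string


if __name__ == "__main__":
    url = "http://caselaw.findlaw.com/us-supreme-court/307/174.html"
    for role in (":poss", ":source"):
        amr = site_page_amr("Findlaw", url, role)
        print(amr, is_balanced(amr))
```

Only the role on the company node changes between the two variants; the :mod and :location parts stay the same.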
Thanks, Tim, good point. When I checked the SemEval AMRs earlier this week, the need for expanding the guidelines in this respect became clear as well. And I think that your proposals generally reflect what most annotators have been doing.
At a more detailed level, regarding question 1, I have seen cases such as
(p / publication :wiki "CNN" :name (n / name :op1 "CNN")
:ARG1-of (l / link-01
:ARG2 (u / url-entity :value "http://www.cnn.com")))
so, basically, link-01 instead of :mod.
Regarding question 2, I wonder whether we might not want to drop deictic terms such as "this" and "here". Example: The report can be found <a href="...">here</a>.
(p / possible-01
:ARG1 (f / find-01
:ARG1 (r / report)
:location (u / url-entity :value "https://www.amnestyusa.org/sites/default/files/air12-report-english.pdf")))
-- Ulf
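The idea of dropping deictic link text can be sketched as a small preprocessing helper. This is only an illustration under stated assumptions: the `DEICTIC` word list, the `link_to_amr` function, and the simple regular expression for anchors are all hypothetical, and real link text would need more careful handling.

```python
import re

# Assumed list of link texts that carry no content of their own.
DEICTIC = {"here", "this", "this link", "the link", "in the link"}

# Deliberately simple pattern for <a href="...">text</a>; not a full HTML parser.
ANCHOR = re.compile(r'<a\s+href="([^"]*)"\s*>(.*?)</a>', re.IGNORECASE | re.DOTALL)


def link_to_amr(anchor_html: str, var: str = "u"):
    """Return a bare url-entity node when the link text is purely deictic.

    Returns None when no anchor is found or when the link text has content
    (e.g. a site name or headline) that needs its own annotation.
    """
    match = ANCHOR.search(anchor_html)
    if match is None:
        return None
    url, text = match.group(1), match.group(2).strip()
    if text.lower() in DEICTIC:
        return f'({var} / url-entity :value "{url}")'
    return None


if __name__ == "__main__":
    print(link_to_amr('<a href="http://www.cnn.com">here</a>'))
    print(link_to_amr('<a href="http://www.cnn.com">this Findlaw page</a>'))
```

With this split, "here" collapses to the url-entity alone, while descriptive link text such as "this Findlaw page" is left for the annotator to handle.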
Decision at AMR phone meeting on Dec. 7, 2015:
Hi all! Looking at the IAA diffs, I noticed that we're all doing "link text" -- URLs that are replaced with text -- differently, and wanted to get us on the same page. (It would be good to get examples for each of these issues into the guidelines, so that we are more consistent about it.)
Question 1:
How do we treat links with article titles as the link text?
PolitiFact | The Obameter: Create 5 million "green" jobs
Lowe's pulls advertising from TLC's 'All-American Muslim' - CNN.com
Missing Guns Raise Eyebrows over U.S. Arms Dealings Abroad
We've done these as:
flat named articles:
as normal AMRs with url-entity tag:
as a normal AMR with no url-entity:
Proposed:
The second option -- as a normal AMR with a url-entity tag, and with ":mod" used to link to the url-entity -- seems like the best to me, since we seem to have traditionally just parsed headlines as normal text. Does that sound like a good treatment?
Question 2:
How do we deal with link text when it's a description of the address -- particularly when it contains information like the website or host publication?
I'm assuming that for link text like "here" or "in the link", we could replace it with the url-entity directly. The question is for issues like:
This case can be read at this Findlaw page
I haven't found anything yet, but I came across this interesting information in Wikipedia:
Proposed:
I'd imagine that we'd want to keep the NEs mentioned in "the Findlaw page", "from Wikipedia", "CNN" etc. I'd personally want this kind of mention to look like:
Any opinions on those?