hypothesis / h

Annotate with anyone, anywhere.
https://hypothes.is/
BSD 2-Clause "Simplified" License
2.93k stars 427 forks source link

Forbes metadata parsing can't discover relation between print and normal view #1760

Closed csillag closed 9 years ago

csillag commented 9 years ago

Looking at the source of http://www.forbes.com/sites/robertbradley/2014/12/01/doubling-down-on-climate-alarmism/ , I see this in the header: <link rel="shortlink" href="http://onforb.es/12gVToW" />

The print view or the article is here: http://www.forbes.com/sites/robertbradley/2014/12/01/doubling-down-on-climate-alarmism/print/ The header has the same tag: <link rel="shortlink" href="http://onforb.es/12gVToW" />

However, as far as I can tell, we are not loading annotations made on one page for the other. (Why is that? Is shortlink not among the list of data that we can use for discovering identity?)

gergely-ujvari commented 9 years ago

Sometimes, when I load the extension on the print page, I got this exception:

 Error in setupAnnotation for undefined : Error: No mappings found for [1739:1795]!
    at DomTextMapper.window.DomTextMapper.DomTextMapper.getMappingsForCharRange (chrome-extension://bjfhmglciegochdpefhhlphglcehbmek/public/scripts/hypothesis.min.js?7886a6ce:17:12949)
    at TextPositionAnchor._createHighlight (chrome-extension://bjfhmglciegochdpefhhlphglcehbmek/public/scripts/hypothesis.min.js?7886a6ce:20:1076)
    at TextPositionAnchor.<anonymous> (chrome-extension://bjfhmglciegochdpefhhlphglcehbmek/public/scripts/hypothesis.min.js?7886a6ce:11:18922)
    at TextPositionAnchor.Anchor.realize (chrome-extension://bjfhmglciegochdpefhhlphglcehbmek/public/scripts/hypothesis.min.js?7886a6ce:11:18964)
    at TextPositionAnchor.realize (chrome-extension://bjfhmglciegochdpefhhlphglcehbmek/public/scripts/hypothesis.min.js?7886a6ce:11:591)
    at Host.Annotator.setupAnnotation (chrome-extension://bjfhmglciegochdpefhhlphglcehbmek/public/scripts/hypothesis.min.js?7886a6ce:12:302)
    at loader (chrome-extension://bjfhmglciegochdpefhhlphglcehbmek/public/scripts/hypothesis.min.js?7886a6ce:12:1193)
    at Host.Annotator.loadAnnotations (chrome-extension://bjfhmglciegochdpefhhlphglcehbmek/public/scripts/hypothesis.min.js?7886a6ce:12:1554)
    at Object.loadAnnotations (chrome-extension://bjfhmglciegochdpefhhlphglcehbmek/public/scripts/hypothesis.min.js?7886a6ce:13:4424)
    at Object.onMessage [as handler] (chrome-extension://bjfhmglciegochdpefhhlphglcehbmek/public/scripts/hypothesis.min.js?7886a6ce:9:7232)

So this can be the cause too.

csillag commented 9 years ago

This ticket is about not being able to discovering document equivalency, and thus not loading the right set of annotations.

The comment above is about a different problem (which might or might not show that even the print page is not "static enough"); I will look into that separately.

gergely-ujvari commented 9 years ago

Actually, if I open your print link, it just redirects me to the normal page, and there I got the exception. Okay, D-T-M can come later.

Otherwise, yes, our document plugin is not collecting the url.

Document data for an annotation created in the print page:

    "document": {
        "title": [
            "Doubling Down On Climate Alarmism"
        ], 
        "dc": {}, 
        "highwire": {}, 
        "eprints": {}, 
        "link": [
            {
                "href": "http://www.forbes.com/sites/robertbradley/2014/12/01/doubling-down-on-climate-alarmism/print/"
            }
        ], 
        "twitter": {
            "card": [
                "summary"
            ], 
            "site": [
                "@forbes"
            ]
        }, 
        "reply_to": [], 
        "facebook": {
            "title": [
                "Doubling Down On Climate Alarmism"
            ], 
            "description": [
                "This week, UN representatives are meeting in Peru in hopes of crafting an agreement that will limit global greenhouse emissions and combat the (nonexistent) problem of rising worldwide temperatures. This comes on the heels of the recent climate pact between the U.S. and China aimed to reduce CO2 emissions. While these events may be cause for celebration amongst green activists and the Obama Administration, for the average American, they are just two more misspent efforts by a closed-minded political elite in a world with real, here-and-now problems."
            ], 
            "type": [
                "article"
            ], 
            "site_name": [
                "Forbes"
            ], 
            "image": [
                "http://images.forbes.com/media/assets/forbes_1200x1200.jpg"
            ]
        }, 
        "prism": {}
    }, 
    "uri": "http://www.forbes.com/sites/robertbradley/2014/12/01/doubling-down-on-climate-alarmism/print/"

and for an annotation created not at the print page:

    "document": {
        "title": [
            "Doubling Down On Climate Alarmism"
        ], 
        "dc": {}, 
        "highwire": {}, 
        "eprints": {}, 
        "link": [
            {
                "href": "http://www.forbes.com/sites/robertbradley/2014/10/29/bad-science-worse-policy/"
            }, 
            {
                "rel": "canonical", 
                "type": "", 
                "href": "http://www.forbes.com/sites/robertbradley/2014/12/01/doubling-down-on-climate-alarmism/"
            }
        ], 
        "twitter": {
            "card": [
                "summary"
            ], 
            "site": [
                "@forbes"
            ]
        }, 
        "reply_to": [], 
        "facebook": {
            "title": [
                "Doubling Down On Climate Alarmism"
            ], 
            "image": [
                "http://images.forbes.com/media/assets/forbes_1200x1200.jpg"
            ], 
            "description": [
                "This week, UN representatives are meeting in Peru in hopes of crafting an agreement that will limit global greenhouse emissions and combat the (nonexistent) problem of rising worldwide temperatures. This comes on the heels of the recent climate pact between the U.S. and China aimed to reduce CO2 emissions. While these events may be cause for celebration amongst green activists and the Obama Administration, for the average American, they are just two more misspent efforts by a closed-minded political elite in a world with real, here-and-now problems."
            ], 
            "type": [
                "article"
            ], 
            "site_name": [
                "Forbes"
            ], 
            "url": [
                "http://www.forbes.com/sites/robertbradley/2014/12/01/doubling-down-on-climate-alarmism/"
            ]
        }, 
        "prism": {}
    }, 
    "uri": "http://www.forbes.com/sites/robertbradley/2014/12/01/doubling-down-on-climate-alarmism/"
csillag commented 9 years ago

Actually, if I open your print link, it just redirects me to the normal page, and there I got the exception.

Oh, I see. So, it looks like that in order to actually get to the print page, one has to click on the "print" link manually.

tilgovi commented 9 years ago

rel=shortlink is not a standard link relation. If we think it's likely to be useful, we can add support, but Forbes really should have a rel=canonical. Do they not?

csillag commented 9 years ago

Yes, they should, but currently, they don't. Regardless, I think recognizing "shortlink" will be useful.

Treora commented 9 years ago

I suppose they could be useful. Even our own website (Wordpress) uses rel='shortlink' links..

On 5 December 2014 at 01:44, Randall Leeds notifications@github.com wrote:

rel=shortlink is not a standard link relation. If we think it's likely to be useful, we can add support, but Forbes really should have a rel=canonical. Do they not?

— Reply to this email directly or view it on GitHub https://github.com/hypothesis/h/issues/1760#issuecomment-65724728.

tilgovi commented 9 years ago

Okay. Reading a bit I'm convinced this is a real thing. This belongs in the Document plugin and not in this issue tracker, though.

csillag commented 9 years ago

Great! The PR fixing this upstream (https://github.com/openannotation/annotator/pull/470) has been merged! Now we just need to port https://github.com/openannotation/annotator/commit/7adbdaf5eca56856225c22d0ea457f2139bdaacf to our own branch...

csillag commented 9 years ago

OK, the fixed document plugin is now on H master. (Thx to @gergely-ujvari.) Should work now.

However, in the meantime, another problem (https://github.com/openannotation/annotator-store/issues/110) was found, that might break this in some cases.

So it might not work properly in various situations... :(

gergely-ujvari commented 9 years ago

I've tested this locally and after creating a new annotation with the proper metadata this is working. So closing this for now.

Fixed by https://github.com/hypothesis/h/commit/4348451215c3dd8c8934555584e47f7c4ed2a807