otiai10 / opengraph

Open Graph Parser for Go
https://pkg.go.dev/github.com/otiai10/opengraph
MIT License
65 stars 13 forks source link

Fix duplicate media when both root and `*:url` are present #29

Closed kainosnoema closed 2 years ago

kainosnoema commented 2 years ago

Some sites define both og:image,audio,video as well as the og:*:url property. In that case we need to skip adding the media again.

codecov-commenter commented 2 years ago

Codecov Report

Merging #29 (10561b3) into main (4b9a7c7) will increase coverage by 0.11%. The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main      #29      +/-   ##
==========================================
+ Coverage   93.67%   93.78%   +0.11%     
==========================================
  Files           3        3              
  Lines         158      161       +3     
==========================================
+ Hits          148      151       +3     
  Misses          6        6              
  Partials        4        4              
Impacted Files Coverage Δ
meta.go 93.75% <100.00%> (+0.30%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 4b9a7c7...10561b3. Read the comment docs.

kainosnoema commented 2 years ago

What I understand from the spec is that og:video:url describes the url property of one piece of media, just like the other media sub properties. In that case the way it was parsing before was incorrect and this fixes that.

In fact according to the spec, what Facebook did (I found this on Facebook’s own property) is completely correct:

The og:image property has some optional structured properties: og:image:url - Identical to og:image.

So the url property when identical to the root should be treated as the same piece of media.

On Tue, Oct 19 2021 at 01:53, Hiromu OCHIAI < @.*** > wrote:

@.**** commented on this pull request.

Thank you!

The point I need to make sure is: Do we really need to skip it if duplicated or should we collect everything as described on the target?

Let me check some documents and make my own opinion on that.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub ( https://github.com/otiai10/opengraph/pull/29#pullrequestreview-782979041 ) , or unsubscribe ( https://github.com/notifications/unsubscribe-auth/AABDUH4FBDT5Q6F5X7QICLDUHUWZZANCNFSM5GHKOIZA ). Triage notifications on the go with GitHub Mobile for iOS ( https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 ) or Android ( https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub ).

otiai10 commented 2 years ago

Thanks @kainosnoema ! super helpful to understand.