bluesky-social / atproto

Social networking technology created by Bluesky

Improper mention facet extraction (problem with byte offsets) #2823

Closed mfn closed 1 month ago

mfn commented 2 months ago

Describe the bug

As a consumer of the Bluesky API I came across a post with mention facets where my code wasn't highlighting the mention facets correctly.

After some analysis I realized that the information I received via the API must be wrong. I then located the post itself on Bluesky and saw that it is also rendered incorrectly there, exactly as my code produced it.

To Reproduce

I do not have a reproducer. I did not write the post; I just tripped over it when consuming the API.

Here's the post https://bsky.app/profile/diegodeabreu.bsky.social/post/3l4jzk4mkzd2r

Screenshot: image

As can be seen, the mention highlights are wrong (this is consistent with what the API returns, though).

Expected behavior

Mention byte offsets should be correct.

Details

I don't have any of these details, it's the official server.

Additional context

Here is the payload of the API call I received when I requested that post

API response

```
{
  "posts": [
    {
      "labels": [],
      "uri": "at://did:plc:pxbptk7tzl3szbgaxxg36rru/app.bsky.feed.post/3l4jzk4mkzd2r",
      "quoteCount": 0,
      "indexedAt": "2024-09-19T21:47:55.296Z",
      "replyCount": 1,
      "repostCount": 1,
      "likeCount": 9,
      "viewer": { "threadMuted": false, "embeddingDisabled": false },
      "cid": "bafyreidhsova7ysf42s3jvr2byz3nzzo57ivh27vpkhzd2qvidpuare27u",
      "author": {
        "displayName": "Diego Feijó de Abreu ",
        "avatar": "https://cdn.bsky.app/img/avatar/plain/did:plc:pxbptk7tzl3szbgaxxg36rru/bafkreicmprenqtyhepzpsuv5uor4bdlol3fliiadhxeej6uy6vojfjb44e@jpeg",
        "associated": { "chat": { "allowIncoming": "all" } },
        "viewer": { "muted": false, "blockedBy": false },
        "labels": [],
        "createdAt": "2023-04-17T19:34:49.429Z",
        "did": "did:plc:pxbptk7tzl3szbgaxxg36rru",
        "handle": "diegodeabreu.bsky.social"
      },
      "record": {
        "$type": "app.bsky.feed.post",
        "createdAt": "2024-09-19T21:47:55.296624Z",
        "facets": [
          {
            "features": [ { "$type": "app.bsky.richtext.facet#mention", "did": "did:plc:nvfposmpmhegtyvhbs75s3pw" } ],
            "index": { "byteStart": 20, "byteEnd": 37 }
          },
          {
            "features": [ { "did": "did:plc:s6j27rxb3ic2rxw73ixgqv2p", "$type": "app.bsky.richtext.facet#mention" } ],
            "index": { "byteEnd": 81, "byteStart": 60 }
          },
          {
            "features": [ { "did": "did:plc:uewxgchsjy4kmtu7dcxa77us", "$type": "app.bsky.richtext.facet#mention" } ],
            "index": { "byteEnd": 118, "byteStart": 104 }
          },
          {
            "index": { "byteEnd": 161, "byteStart": 141 },
            "features": [ { "$type": "app.bsky.richtext.facet#mention", "did": "did:plc:mf5dzzqkp7fnmby6blfeljwj" } ]
          },
          {
            "features": [ { "did": "did:plc:e62gb2ushvtvjvqcbrxeaw2n", "$type": "app.bsky.richtext.facet#mention" } ],
            "index": { "byteStart": 184, "byteEnd": 208 }
          },
          {
            "features": [ { "$type": "app.bsky.richtext.facet#mention", "did": "did:plc:e72cwu7fen37hzzzhwy6mkxp" } ],
            "index": { "byteEnd": 257, "byteStart": 231 }
          }
        ],
        "reply": {
          "parent": {
            "cid": "bafyreid53re4egytyqqmvvcq4plfm4n5qeeb7th7cunmkfg2v5z5cttmae",
            "uri": "at://did:plc:pxbptk7tzl3szbgaxxg36rru/app.bsky.feed.post/3l4jzk2qju32g"
          },
          "root": {
            "cid": "bafyreid53re4egytyqqmvvcq4plfm4n5qeeb7th7cunmkfg2v5z5cttmae",
            "uri": "at://did:plc:pxbptk7tzl3szbgaxxg36rru/app.bsky.feed.post/3l4jzk2qju32g"
          }
        },
        "text": "Posição: 1º podcast @jamellebouie.net \nPosição: 2º podcast @kenwhite.bsky.social \nPosição: 3º podcast @bloomberg.com \nPosição: 4º podcast @junlper.bsky.social \nPosição: 5º podcast @chrislhayes.bsky.social \nPosição: 6º podcast @hausofdecline.bsky.social"
      }
    }
  ]
}
```

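For anyone consuming this payload, the first facet can be checked by hand. The sketch below is Python and uses only the first line of the `text` field as it appears in the response above; the helper name `slice_by_bytes` is made up for illustration. It suggests, without proving anything about how the post was actually created, that `byteStart`/`byteEnd` of the first mention behave like character offsets rather than UTF-8 byte offsets, because `ç`, `ã` and `º` each take two bytes.

```python
# First line of the post text, copied from the "text" field of the payload.
text = "Posição: 1º podcast @jamellebouie.net \n"

# Index of the first mention facet, copied from the payload.
facet_index = {"byteStart": 20, "byteEnd": 37}


def slice_by_bytes(s: str, start: int, end: int) -> str:
    """Hypothetical helper: slice a string by offsets into its UTF-8 encoding."""
    return s.encode("utf-8")[start:end].decode("utf-8", errors="replace")


# What a consumer gets when treating the index as UTF-8 byte offsets.
as_bytes = slice_by_bytes(text, facet_index["byteStart"], facet_index["byteEnd"])

# What the same numbers select when (mis)read as character offsets.
as_code_points = text[facet_index["byteStart"]:facet_index["byteEnd"]]

# Byte offsets that would actually cover the mention.
mention = "@jamellebouie.net"
correct_start = len(text[: text.index(mention)].encode("utf-8"))
correct_end = correct_start + len(mention.encode("utf-8"))

print(repr(as_bytes))        # shifted to the left of the mention
print(repr(as_code_points))  # lines up with the mention
print(correct_start, correct_end)
```

Read as byte offsets, the range starts three bytes too early, which matches the shifted highlight described above.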
bnewbold commented 2 months ago

It looks to me like the posts you have linked to (by @diegodeabreu.bsky.social) are test posts by a developer who is learning how to implement the facet system. Under the bsky post Lexicons, it is up to the client developer creating the posts to generate facets correctly (or to use an SDK/helper which implements this for them).

If that is correct, this issue should be filed/reported to that developer themselves, right?
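As a rough illustration of what "correctly generate facets" involves on the client side, here is a minimal Python sketch. It is not the official SDK logic: `MENTION_RE` is a simplified pattern rather than the real handle grammar, and `mention_facets` and `resolve_did` are hypothetical names (the resolver stands in for whatever handle-to-DID lookup the client uses). The one point it is meant to show is that offsets are measured on the UTF-8 encoding of the text, not on character positions.

```python
import re

# Simplified mention pattern for illustration only (assumption, not the
# official atproto handle grammar).
MENTION_RE = re.compile(r"(?<![\w@])@([a-zA-Z0-9][a-zA-Z0-9.-]*\.[a-zA-Z]{2,})")


def mention_facets(text, resolve_did):
    """Build mention facets whose index uses UTF-8 byte offsets.

    resolve_did is a hypothetical callable mapping a handle to a DID,
    or returning None when the handle cannot be resolved.
    """
    facets = []
    for match in MENTION_RE.finditer(text):
        did = resolve_did(match.group(1))
        if did is None:
            continue  # unresolved handles are left as plain text
        # Convert the character position to a byte position by encoding
        # everything before the match.
        byte_start = len(text[: match.start()].encode("utf-8"))
        byte_end = byte_start + len(match.group(0).encode("utf-8"))
        facets.append(
            {
                "index": {"byteStart": byte_start, "byteEnd": byte_end},
                "features": [
                    {"$type": "app.bsky.richtext.facet#mention", "did": did}
                ],
            }
        )
    return facets


# Usage with a stub resolver; the DID is the one from the payload in this issue.
known = {"jamellebouie.net": "did:plc:nvfposmpmhegtyvhbs75s3pw"}
facets = mention_facets("Posição: 1º podcast @jamellebouie.net \n", known.get)
print(facets[0]["index"])
```

Consumers do the reverse: encode the text as UTF-8, slice by `byteStart`/`byteEnd`, then decode the slice.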

mfn commented 2 months ago

Interesting approach to leave this up to the client and leave the possible interpretation of broken data to everyone else.

I've nothing more to add, I'm not the creator, just a consumer of the api.

Seems to me this is all by design and there's nothing actionable.

If so, I guess this issue can be closed?

mary-ext commented 1 month ago

Yes, issue can be closed, there isn't really anything to fix here (unfortunately/fortunately)