snarfed / granary

💬 The social web translator
https://granary.io
Creative Commons Zero v1.0 Universal
432 stars 57 forks

Bluesky.from_as1: convert links, @-mentions, and hashtags to facets #675

Open snarfed opened 6 months ago

snarfed commented 6 months ago

We should support facet output in from_as1! Specifically, convert mention and hashtag tags to app.bsky.richtext.facet#mention and #tag, respectively, and HTML links in content to #link.

This will be a bit tricky, since we don't currently generate indices for those tags in eg microformats.json_to_object and Source.postprocess_object, and we don't generate anything at all for plain links yet. We'd also want to look into how AS1 tag indices work: for example, do they always index into plain text content, eg content.value?
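
For illustration, here is a minimal sketch of the tag-to-facet conversion described above. It is not granary's implementation. It assumes each AS1 tag carries `startIndex` and `length` as character offsets into the plain text, and that a mention tag's `url` already holds a DID; the function name `as1_tag_to_facet` is hypothetical. Bluesky facet indices are UTF-8 byte offsets, so the character offsets have to be converted.

```python
def as1_tag_to_facet(text, tag):
    """Build a Bluesky facet from an AS1-style tag (sketch).

    Assumption: tag['startIndex'] and tag['length'] are character offsets
    into the plain text. Bluesky facet indices are UTF-8 byte offsets.
    """
    start = tag['startIndex']
    end = start + tag['length']

    if tag.get('objectType') == 'hashtag':
        feature = {
            '$type': 'app.bsky.richtext.facet#tag',
            'tag': tag['displayName'].lstrip('#'),
        }
    elif tag.get('objectType') == 'mention':
        feature = {
            '$type': 'app.bsky.richtext.facet#mention',
            'did': tag['url'],  # assumption: url is already a DID
        }
    else:
        # Other tag types have no facet equivalent in this sketch.
        return None

    return {
        '$type': 'app.bsky.richtext.facet',
        'index': {
            # Encode the prefix to count bytes rather than characters.
            'byteStart': len(text[:start].encode('utf-8')),
            'byteEnd': len(text[:end].encode('utf-8')),
        },
        'features': [feature],
    }
```

With the text `café time #coffee`, a hashtag tag at character offset 10 lands at byte offset 11, because `é` takes two bytes in UTF-8.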

cc @JoelOtter

JoelOtter commented 6 months ago

Possibly not the place for this(?) but we also need to think about link cards on Bluesky as those aren't implicitly created like they are on e.g. Mastodon. Should it be the first or last link in a post, by default? My gut would be last based on how I usually format posts

snarfed commented 6 months ago

@JoelOtter Ooh good point! Last link sounds fine to me, but I wonder if there's a more native mf2 way to do it. Will ask on #microformats.

snarfed commented 6 months ago

From https://github.com/snarfed/bridgy/issues/1661#issuecomment-1932287001 :

Apart from @-mentions specifically, I'll also echo here what I mentioned in https://github.com/snarfed/granary/issues/675 in general : this is going to be difficult to implement. We have to "disassemble" span-based HTML markup into Bluesky index-based facets. For arbitrary content HTML, we have to parse it, extract just the tags we care about (currently links and microformats2 hashtag u-categories), discard other tags (including overlapping ones like the span here), extract the plain text, calculate the start and end indices of the tags we care about into the plain text, convert those indices to bytes in the HTML document's character encoding, and populate all of that into Bluesky facets. Phew.

@kevinmarks sent the old XOXO parser as an example that does this, which is great! Still though. This feels like a nontrivial project.
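
To make the "disassemble" steps above concrete, here is a rough sketch using only Python's standard `html.parser`. It is not the XOXO parser or granary's code, and the names `LinkFacetParser` and `html_to_text_and_facets` are hypothetical. It handles only `<a href>` links, discards all other tags, and does no whitespace normalization.

```python
from html.parser import HTMLParser


class LinkFacetParser(HTMLParser):
    """Collects plain text and link facets from an HTML fragment (sketch)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.text = ''
        self.facets = []
        self._open_links = []  # stack of (href, character offset at <a>)

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            self._open_links.append((dict(attrs).get('href'), len(self.text)))

    def handle_endtag(self, tag):
        if tag != 'a' or not self._open_links:
            return
        href, start = self._open_links.pop()
        end = len(self.text)
        if href and end > start:
            self.facets.append({
                '$type': 'app.bsky.richtext.facet',
                'index': {
                    # Convert character offsets to UTF-8 byte offsets.
                    'byteStart': len(self.text[:start].encode('utf-8')),
                    'byteEnd': len(self.text[:end].encode('utf-8')),
                },
                'features': [{
                    '$type': 'app.bsky.richtext.facet#link',
                    'uri': href,
                }],
            })

    def handle_data(self, data):
        # Text inside any tag, including ones we discard, is kept.
        self.text += data


def html_to_text_and_facets(html):
    parser = LinkFacetParser()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return parser.text, parser.facets
```

Nested tags inside a link, like a `<b>` or `<span>`, simply contribute their text, so the link's byte range still covers the whole link text.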

snarfed commented 6 months ago

@JoelOtter we discussed link preview cards briefly on #microformats and ended up with the proposal that users could use u-featured to indicate the link(s) to preview: https://indieweb.org/link-preview#which_link_to_preview . If none of the links have that, we could default to first, or whatever.

JoelOtter commented 6 months ago

Makes good sense to me. I think I'd still opt to make it the last one but as long as it's documented and configurable either should be good!

JoelOtter commented 6 months ago

I guess the other question is: do we still include the URL in the post itself, or do we remove it in favour of just the link card? Bluesky lets you do this.

snarfed commented 6 months ago

Oh sorry, yes, definitely last!

Whether to remove the link or not, good question. Maybe yes when it's the very last text in content, and we generate a preview for it, otherwise no?
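
A small sketch of that rule, assuming the preview URL appears verbatim in the plain text; `strip_trailing_link` is a hypothetical helper, not an existing granary function:

```python
def strip_trailing_link(text, preview_url):
    """Drop preview_url from text only if it is the very last thing in it.

    Assumption: preview_url is the link we generated a preview card for,
    or None/empty if no card was generated.
    """
    stripped = text.rstrip()
    if preview_url and stripped.endswith(preview_url):
        return stripped[:-len(preview_url)].rstrip()
    return text
```

Any facets would need to be computed after this step, since removing text changes the byte offsets.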

snarfed commented 4 months ago

Got hashtags working! https://bsky.app/profile/snarfed.bsky.social/post/3kp3bfhk25d2c

JoelOtter commented 4 months ago

I fear this has broken Bluesky publish - it seems to be creating facets for all tags, not just hashtags, which may not be present in the text. Example for this post:

https://www.joelotter.com/notes/2024/04/05-japan1/

https://brid.gy/log?module=default&start_time=1712287175&key=agdicmlkLWd5clkLEg1QdWJsaXNoZWRQYWdlIjJodHRwczovL3d3dy5qb2Vsb3R0ZXIuY29tL25vdGVzLzIwMjQvMDQvMDUtamFwYW4xLwwLEgdQdWJsaXNoGICA-O_I7JYLDA
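
One way to guard against this, sketched under the assumption that tags are AS1-style dicts with `objectType` and `displayName`: only emit a facet when the tag is a hashtag and `#name` actually occurs in the text. This is not necessarily how the fix was made, and `facets_for_hashtags_in_text` is a hypothetical name.

```python
import re


def facets_for_hashtags_in_text(text, tags):
    """Return tag facets only for hashtags that literally appear in text."""
    facets = []
    for tag in tags:
        if tag.get('objectType') != 'hashtag':
            continue  # skip mentions, people, and other tag types
        name = (tag.get('displayName') or '').lstrip('#')
        if not name:
            continue
        # Match "#name" not followed by another word character.
        match = re.search('#' + re.escape(name) + r'(?!\w)', text, re.IGNORECASE)
        if not match:
            continue  # tag is not present in the text, so no facet
        facets.append({
            '$type': 'app.bsky.richtext.facet',
            'index': {
                'byteStart': len(text[:match.start()].encode('utf-8')),
                'byteEnd': len(text[:match.end()].encode('utf-8')),
            },
            'features': [{
                '$type': 'app.bsky.richtext.facet#tag',
                'tag': name,
            }],
        })
    return facets
```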

snarfed commented 4 months ago

Oh no! Sorry about that, you're right. Will fix.

snarfed commented 4 months ago

OK @JoelOtter that should be fixed, feel free to try again.

JoelOtter commented 4 months ago

Works great, thank you!

snarfed commented 4 months ago

Got mentions working! https://bsky.app/profile/snarfed.bsky.social/post/3kpgoiehgt32y

snarfed commented 4 months ago

@JoelOtter feel free to try hashtags with p-category, and @-mentions with any link to a bsky.app user with link text starting with @, if you want!
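
As a rough illustration of that @-mention rule, the check could look something like this. It assumes bsky.app user links have the form `https://bsky.app/profile/HANDLE`, and `bsky_mention_handle` is a hypothetical helper name. A `#mention` facet needs a DID, so resolving the returned handle to a DID would be a separate step that is not shown here.

```python
from urllib.parse import urlparse


def bsky_mention_handle(href, link_text):
    """Return the handle if this link looks like a Bluesky @-mention, else None.

    Assumption: user links look like https://bsky.app/profile/HANDLE.
    """
    if not link_text.strip().startswith('@'):
        return None
    parsed = urlparse(href)
    if parsed.netloc != 'bsky.app':
        return None
    parts = parsed.path.strip('/').split('/')
    # Exactly /profile/HANDLE; longer paths (eg posts) are not user links.
    if len(parts) == 2 and parts[0] == 'profile' and parts[1]:
        return parts[1]
    return None
```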

snarfed commented 1 month ago

The one remaining bit here is to convert arbitrary HTML links, ie <a href> tags, to Bluesky facets.