Closed ShreyanJain9 closed 1 year ago
Dumping some stuff here. You are probably already aware of this, but maybe it helps a bit. To make links and mentions work, you need to add facets to the record:
facets = detect_facets(text)
input = Bskyrb::ComAtprotoRepoCreaterecord::CreateRecord::Input.from_hash({
  "collection" => "app.bsky.feed.post",
  "$type" => "app.bsky.feed.post",
  "repo" => session.did,
  "record" => {
    "$type" => "app.bsky.feed.post",
    "createdAt" => DateTime.now.iso8601(3),
    "text" => text,
    "facets" => facets
  }
})
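One detail worth spelling out: the `byteStart` and `byteEnd` values in a facet index are offsets into the UTF-8 encoded text, with `byteEnd` exclusive, so they stop matching Ruby character indices as soon as the text contains non-ASCII characters. A minimal sketch of the difference; `byte_range` is a hypothetical helper made up for illustration, not part of Bskyrb:

```ruby
# Hypothetical helper: returns the UTF-8 byte range covered by the first
# occurrence of `needle` in `text`, or nil if `needle` is not present.
def byte_range(text, needle)
  char_start = text.index(needle)
  return nil if char_start.nil?

  byte_start = text[0...char_start].bytesize # bytes before the match
  { 'byteStart' => byte_start, 'byteEnd' => byte_start + needle.bytesize }
end

text = "héllo 🙂 @alice.test"
range = byte_range(text, '@alice.test')
puts text.index('@alice.test') # character index of the mention
puts range.inspect             # byte offsets, which are larger for this text
```

Slicing the text with `text.byteslice(range['byteStart']...range['byteEnd'])` should give back the mention, which is a quick way to sanity-check the numbers.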
When looking at the official @atproto/api package you can find such a function here: https://github.com/bluesky-social/atproto/blob/e7a0d27f1fef15d68a04be81cec449bfe3b1011f/packages/api/src/rich-text/detection.ts#L7
Here is a quick and dirty translation from TypeScript to Ruby using ChatGPT. The regular expressions don't seem to work well yet, so that part needs some work. There are probably better solutions, but I currently lack the time to help out on that.
def detect_facets(text)
  facets = []
  utf8_text = text.encode('UTF-8')

  # mentions
  re = /(^|\s|\()(@)([a-zA-Z0-9.-]+)(\b)/
  utf8_text.scan(re) do
    m = Regexp.last_match # MatchData gives offsets as well as the captures
    handle = m[3]
    next if !valid_domain?(handle) && !handle.end_with?('.test')
    # facet indices are UTF-8 byte offsets, so count the bytes before the '@'
    start = utf8_text[0...m.begin(2)].bytesize
    facets.push({
      '$type' => 'app.bsky.richtext.facet',
      'index' => {
        'byteStart' => start,
        'byteEnd' => start + handle.bytesize + 1, # + 1 for the '@'
      },
      'features' => [
        {
          '$type' => 'app.bsky.richtext.facet#mention',
          'did' => handle # still a handle; must be resolved to a DID before posting
        },
      ],
    })
  end

  # links
  # Ruby stops capturing numbered groups once a named group is present,
  # so both the URI and the domain are named groups here
  re = /(?:^|\s|\()(?<uri>https?:\/\/\S+|(?<domain>[a-z][a-z0-9]*(?:\.[a-z0-9]+)+)\S*)/i
  utf8_text.scan(re) do
    m = Regexp.last_match
    uri = m[:uri]
    start = utf8_text[0...m.begin(:uri)].bytesize
    index = { 'start' => start, 'end' => start + uri.bytesize }
    if !uri.start_with?('http')
      domain = m[:domain]
      next if !domain || !valid_domain?(domain)
      uri = "https://#{uri}"
    end
    # strip ending punctuation
    if uri.match?(/[.,;!?]\z/)
      uri = uri[0..-2]
      index['end'] -= 1
    end
    if uri.end_with?(')') && !uri.include?('(')
      uri = uri[0..-2]
      index['end'] -= 1
    end
    facets.push({
      'index' => {
        'byteStart' => index['start'],
        'byteEnd' => index['end'],
      },
      'features' => [
        {
          '$type' => 'app.bsky.richtext.facet#link',
          'uri' => uri,
        },
      ],
    })
  end

  facets.empty? ? nil : facets
end
def valid_domain?(str)
  tlds = ['com', 'org', 'net', 'io', 'gov', 'edu'] # Define your TLDs here
  # rindex returns nil (not -1) when the TLD is missing, so check the suffix directly
  tlds.any? { |tld| str.end_with?(".#{tld}") }
end
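Note that the `did` field of a mention feature is expected to hold the account's DID rather than the handle, so handles picked up by the detection still need to be resolved before the record is sent. A rough sketch of that step, assuming the `com.atproto.identity.resolveHandle` XRPC query and `https://bsky.social` as the host; the names `resolve_handle` and `resolve_mentions` are hypothetical, and mentions that fail to resolve are simply dropped here:

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical helper: asks the server to resolve a handle.
# Returns the DID string, or nil if the request does not succeed.
def resolve_handle(handle, host: 'https://bsky.social')
  uri = URI("#{host}/xrpc/com.atproto.identity.resolveHandle")
  uri.query = URI.encode_www_form('handle' => handle)
  response = Net::HTTP.get_response(uri)
  return nil unless response.is_a?(Net::HTTPSuccess)

  JSON.parse(response.body)['did']
end

# Replaces the handle stored in each mention feature with its DID.
# `resolver` is any callable that takes a handle and returns a DID or nil.
def resolve_mentions(facets, resolver: method(:resolve_handle))
  (facets || []).filter_map do |facet|
    features = facet['features'].filter_map do |feature|
      # links and other features pass through unchanged
      next feature unless feature['$type'] == 'app.bsky.richtext.facet#mention'

      did = resolver.call(feature['did'].delete_prefix('@'))
      did && feature.merge('did' => did)
    end
    facet.merge('features' => features) unless features.empty?
  end
end
```

Passing the resolver in as a parameter keeps the network call separate from the facet rewriting, so the second function can be tried out with a plain lambda before touching the API.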
Edit: here is a quick attempt to clean the above up a bit. Disclaimer: I haven't tested it fully. The mentions seem to work OK. The link matcher is the part to watch: URI.regexp has many capture groups, so joining the captures loses the slashes, and the whole match has to be used instead.
require 'uri'

def create_facets(text)
  facets = []

  # Regex patterns
  mention_pattern = /(^|\s|\()(@)([a-zA-Z0-9.-]+)(\b)/
  link_pattern = URI.regexp(%w[http https]) # only match http(s) URLs

  # Find mentions
  text.scan(mention_pattern) do
    m = Regexp.last_match
    # UTF-8 byte offsets of "@handle", skipping the leading whitespace or paren
    index_start = text[0...m.begin(2)].bytesize
    index_end = index_start + (m[2] + m[3]).bytesize # byteEnd is exclusive
    facets.push({
      '$type' => 'app.bsky.richtext.facet',
      'index' => {
        'byteStart' => index_start,
        'byteEnd' => index_end,
      },
      'features' => [
        {
          '$type' => 'app.bsky.richtext.facet#mention',
          'did' => m[3] # the matched handle; resolve it to a DID before posting
        },
      ],
    })
  end

  # Find links
  text.scan(link_pattern) do
    m = Regexp.last_match
    index_start = text[0...m.begin(0)].bytesize
    index_end = index_start + m[0].bytesize # byteEnd is exclusive
    facets.push({
      '$type' => 'app.bsky.richtext.facet',
      'index' => {
        'byteStart' => index_start,
        'byteEnd' => index_end,
      },
      'features' => [
        {
          '$type' => 'app.bsky.richtext.facet#link',
          'uri' => m[0] # the whole match; joining the captures drops the "://"
        },
      ],
    })
  end

  facets.empty? ? nil : facets
end
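Since the byte offsets are the easiest part to get wrong, it can help to slice the text back out with each facet's index and compare the result with what was expected. A small sketch of such a check; `facet_slices` is a hypothetical debugging helper, not something from Bskyrb or the official package, and the sample facet is written by hand:

```ruby
# Hypothetical debugging helper: returns the piece of `text` that each
# facet's byte range covers, so wrong offsets are easy to spot.
def facet_slices(text, facets)
  utf8 = text.encode('UTF-8')
  (facets || []).map do |facet|
    index = facet['index']
    utf8.byteslice(index['byteStart']...index['byteEnd'])
  end
end

text = "🙂 see https://example.com/a now"
facets = [
  {
    'index' => { 'byteStart' => 9, 'byteEnd' => 30 },
    'features' => [
      { '$type' => 'app.bsky.richtext.facet#link', 'uri' => 'https://example.com/a' }
    ]
  }
]
puts facet_slices(text, facets).inspect
```

If a slice comes back with a leading space, a missing last character, or a broken emoji, the offsets were most likely computed from character positions instead of bytes.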
Nice! Thank you 🙂
Yeah, it's mostly the facet detection that's been giving me issues. I'll adapt what you've written, most likely 🙂
Thanks for this 🙂 Your thing helped me get to a final solution!
Just leaving this up here. I'll probably try to figure this out tomorrow.