Closed by faucetlol 3 days ago
This seems fine. They don't have to be unique across all users since the URL is scoped by the individual account, right? Do you want to open a PR?
Sure, I'll go ahead and open a PR then. I thought there might be a problem where a new submission wouldn't be downloaded if its submission ID was already in use.
There's a constraint on artist_url, identifier
but nothing more, so that's all good. I guess I would also expect them to be unique across each site, but there's no real reason to enforce that.
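To illustrate why scoping by account is enough, here is a rough sketch. It assumes the constraint is a uniqueness constraint on the (artist_url, identifier) pair; SubmissionIndex and the numeric account IDs are hypothetical stand-ins, not FoxTrove code, and the real check lives in the database.

```ruby
require "set"

# Hypothetical in-memory stand-in for a uniqueness constraint on
# (artist_url, identifier). Not part of FoxTrove.
class SubmissionIndex
  def initialize
    @seen = Set.new
  end

  # Returns true if the pair was new, false if it was already recorded.
  def add?(artist_url_id, identifier)
    # Set#add? returns nil when the element is already present.
    !@seen.add?([artist_url_id, identifier]).nil?
  end
end

index = SubmissionIndex.new
index.add?(1, "3k75hbrv4gy2l") # => true, first time for account 1
index.add?(2, "3k75hbrv4gy2l") # => true, same identifier under a different account
index.add?(1, "3k75hbrv4gy2l") # => false, duplicate within account 1
```

Under that assumption, two accounts sharing an identifier would not collide; only a repeat within the same account would be rejected.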
The cid is currently being used as the submission identifier:
https://github.com/Earlopain/FoxTrove/blob/6531db788b1f785391f7e3eada15ecf814548769/app/logical/scraper/bluesky.rb#L24-L30
But this ID isn't the one used for URLs at all - resulting in a URL that doesn't actually take you anywhere, e.g. https://bsky.app/profile/mayrin.bsky.social/post/bafyreiard66l5ghdzwrklacpjhosuwbiltpikdylqn2cp2bkmmmydqyoru.
I swapped it out for
s.identifier = submission["uri"].split("/").last
which actually provides the ID we need for the URL, giving a working link: https://bsky.app/profile/mayrin.bsky.social/post/3k75hbrv4gy2l. The problem is, I'm not sure whether these IDs are unique across all users. Maybe with that many characters in the ID we can just trust there won't be any conflict among the small subset of users we'll be scraping, but it still doesn't feel ideal.
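For reference, the swap described above can be sketched in isolation. This assumes the uri field is an AT URI of the form at://<did>/<collection>/<rkey>; record_key and post_url are hypothetical helper names, not part of FoxTrove, and the DID in the sample is a made-up placeholder.

```ruby
# Hypothetical helper: pulls the record key (the last path segment)
# out of an AT URI such as at://<did>/app.bsky.feed.post/<rkey>.
def record_key(at_uri)
  at_uri.split("/").last
end

# Hypothetical helper: builds the public post URL from a handle and a record key.
def post_url(handle, rkey)
  "https://bsky.app/profile/#{handle}/post/#{rkey}"
end

# The DID here is a placeholder; the record key is the one from the working link above.
uri = "at://did:plc:example/app.bsky.feed.post/3k75hbrv4gy2l"
puts post_url("mayrin.bsky.social", record_key(uri))
# prints https://bsky.app/profile/mayrin.bsky.social/post/3k75hbrv4gy2l
```

Because the record key sits under a specific account in the URI, it only needs to be distinct within that account for the resulting link to be unambiguous.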