Earlopain / FoxTrove

E6 Upload Helper
GNU General Public License v3.0

BlueSky submission IDs #128

Closed by faucetlol 3 days ago

faucetlol commented 3 days ago

The cid is currently being used as the submission identifier:

https://github.com/Earlopain/FoxTrove/blob/6531db788b1f785391f7e3eada15ecf814548769/app/logical/scraper/bluesky.rb#L24-L30

But this ID isn't the one used in URLs at all, resulting in a URL that doesn't actually take you anywhere, e.g. https://bsky.app/profile/mayrin.bsky.social/post/bafyreiard66l5ghdzwrklacpjhosuwbiltpikdylqn2cp2bkmmmydqyoru.

I swapped it out for s.identifier = submission["uri"].split("/").last, which actually provides the ID we need for the URL, giving a working link: https://bsky.app/profile/mayrin.bsky.social/post/3k75hbrv4gy2l.
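A minimal sketch of that swap, assuming the post's "uri" field is an AT URI of the form at://<did>/app.bsky.feed.post/<record key>. The DID below is a placeholder, and the helper names post_rkey and post_url are hypothetical and not part of FoxTrove:

```ruby
# Hypothetical record shaped like a Bluesky post; the DID is a placeholder.
SAMPLE_SUBMISSION = {
  "uri" => "at://did:plc:example/app.bsky.feed.post/3k75hbrv4gy2l",
  "cid" => "bafyreiard66l5ghdzwrklacpjhosuwbiltpikdylqn2cp2bkmmmydqyoru",
}.freeze

# The last path segment of the AT URI is the key that bsky.app uses in post URLs.
def post_rkey(submission)
  submission["uri"].split("/").last
end

# Build the public post URL from the account handle and the record key.
def post_url(handle, submission)
  "https://bsky.app/profile/#{handle}/post/#{post_rkey(submission)}"
end

puts post_url("mayrin.bsky.social", SAMPLE_SUBMISSION)
```

The cid stays available on the record, but only the trailing segment of the uri is used to build the link.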

The problem is, I'm not sure whether these IDs are universally unique among all users. Maybe with that number of characters in the ID we can just trust there won't be any conflict among the small subset of users we'll be scraping, but it still doesn't feel ideal.

Earlopain commented 3 days ago

This seems fine. They don't have to be unique across all users since the URL is scoped by the individual account, right? Do you want to open a PR?

faucetlol commented 3 days ago

Sure, I'll go ahead and open a PR then. I thought there might be a problem where a new submission wouldn't be downloaded if its submission ID was already in use.

Earlopain commented 3 days ago

There's a constraint on artist_url, identifier but nothing more, so that's all good. I guess I would also expect them to be unique across each site, but there's no real reason to enforce that.
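To illustrate why scoping by artist URL is enough, here is a rough in-memory sketch of a composite uniqueness check on the (artist_url, identifier) pair. This is not FoxTrove's actual schema or model code; the SubmissionIndex class and the integer artist URL IDs are hypothetical stand-ins for the database constraint:

```ruby
require "set"

# Hypothetical stand-in for a uniqueness constraint on (artist_url, identifier).
class SubmissionIndex
  def initialize
    @seen = Set.new
  end

  # Returns true if the pair was new and got recorded, false if it already existed.
  def add(artist_url_id, identifier)
    !@seen.add?([artist_url_id, identifier]).nil?
  end
end

index = SubmissionIndex.new
index.add(1, "3k75hbrv4gy2l") # first artist URL: accepted
index.add(2, "3k75hbrv4gy2l") # same identifier under another artist URL: accepted
index.add(1, "3k75hbrv4gy2l") # same pair again: rejected
```

Under this kind of scoping, two accounts producing the same identifier would not block each other; only a repeat within the same artist URL is treated as a duplicate.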