K1rakishou / Kuroba-Experimental

Free and open source image board browser
GNU General Public License v3.0

Data is not shared between normal threads and their archived versions #415

Open Efruit opened 3 years ago

Efruit commented 3 years ago

If you post in a thread, then open it in an external archive, none of the metadata (Unread posts, your posts, etc.) from the original thread is persisted. Wouldn't it make sense to consider archived threads to be the same entity as the original?

(Similarly, if a bookmarked thread is deleted, it should automatically fall back to an archived version instead of throwing an error, but that might be worth a second issue.)

K1rakishou commented 3 years ago

Yes, it's not shared because the replies belong to different sites. I guess I could automatically mark the same posts in an archived thread when you open it.
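To illustrate the auto-marking idea, here is a minimal sketch. It assumes that an archive keeps the original post numbers (as in the examples later in this thread) and that the app has stored the numbers of the user's own posts for the live thread. The function name and data shapes are hypothetical and are not taken from the KurobaEx codebase.

```python
def mark_own_posts(own_post_numbers, archived_post_numbers):
    """Return the archived post numbers that match posts the user made
    in the original thread (assumes both sites share post numbering)."""
    return sorted(set(own_post_numbers) & set(archived_post_numbers))


# Hypothetical data: posts the user made on the live thread ...
own_on_original = {12345, 12350}
# ... and the post numbers found when the archived copy is opened.
archived_thread = [12345, 12346, 12350, 12368]

marked = mark_own_posts(own_on_original, archived_thread)
print(marked)  # [12345, 12350]
```

The same lookup could in principle carry over other per-post metadata, such as the last-read post.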

> (Similarly, if a bookmarked thread is deleted, it should automatically fall back to an archived version instead of throwing an error, but that might be worth a second issue.)

This is most likely a bug and yes it should go into a separate issue.

K1rakishou commented 3 years ago

So I just checked, and it suggests that you open the thread in an archive, so that's pretty much what you want. It doesn't redirect you to an archive automatically because there may be multiple archives supporting this site/board, so I decided to just ask the user every time which one to use.

[image]

Efruit commented 3 years ago

> So I just checked, and it suggests that you open the thread in an archive, so that's pretty much what you want.

Not really, because it loses data. If a thread gets expunged, I'd like to be able to pick up where I left off. If only one archive is enabled for a board, why not automatically switch the thread to an archive? Or if there are multiple, why not just pick one? Or merge results from all of them? In theory, the archives' contents will be identical.

K1rakishou commented 3 years ago

> If a thread gets expunged, I'd like to be able to pick up where I left off.

You probably want thread downloading but it's not implemented yet.

> If only one archive is enabled for a board, why not automatically switch the thread to an archive?

This probably should be implemented. I will extract it into a separate issue.

> Or if there are multiple, why not just pick one?

How do I know which one you prefer? Some archives may even be banned in your country (but I guess if one is banned then you should disable it so it won't get selected).

> Or merge results from all of them?

This one is too complex. And with ghost posts (which will be supported one day) they will be constantly overwritten by each archive.

Efruit commented 3 years ago

> You probably want thread downloading but it's not implemented yet.

Ideally yes, but even then it should be able to fetch missing posts from an archive to fill in the gaps left by deleted posts, or by posts that were made while the device was unable to fetch them.

> How do I know which you prefer?

You could let the user pick the preferred archive per-board.
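The selection rules discussed so far (use the per-board preference if one is set, switch automatically when only one archive is enabled, otherwise keep asking the user) could be sketched roughly as follows. This is only an illustration under assumed data shapes: `enabled` and `preferred` are hypothetical dictionaries keyed by board code, not actual KurobaEx settings.

```python
from typing import Optional


def pick_archive(board: str, enabled: dict, preferred: dict) -> Optional[str]:
    """Return the archive to switch to automatically, or None when the
    user should be asked (or when no archive is enabled for the board)."""
    candidates = enabled.get(board, [])
    choice = preferred.get(board)
    if choice in candidates:
        return choice          # the user's per-board preference wins
    if len(candidates) == 1:
        return candidates[0]   # only one archive enabled: no need to ask
    return None                # zero or several candidates: ask the user


# Hypothetical settings
enabled = {"g": ["rbt.asia", "wakarimasen.moe"], "a": ["wakarimasen.moe"]}
preferred = {"g": "rbt.asia"}

print(pick_archive("g", enabled, preferred))  # rbt.asia
print(pick_archive("a", enabled, preferred))  # wakarimasen.moe
print(pick_archive("g", enabled, {}))         # None
```

Because a disabled archive never appears in `enabled`, an archive that is blocked for the user is never selected.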

> This one is too complex. And with ghost posts (which will be supported one day) they will be constantly overwritten by each archive.

I disagree. I've only lightly browsed the codebase, but it appears that posts effectively "belong" to a thread, which "belongs" to a board, which "belongs" to a site. The site could be an archive or the original source. An example of that hierarchy for a non-archived post would be:

| Post | Thread | Board | Site |
|---|---|---|---|
| #12345 | #10000 | /g/ | 4channel.org |

Now if it were opened in the archive, it would be:

| Post | Thread | Board | Site |
|---|---|---|---|
| #12345 | #10000 | /g/ | rbt.asia |

If my understanding is correct, it would make more sense to have the post include the physical location, and have the site describe the original source. The relationship would then be similar to:

| Post (Source) | Thread | Board | Site |
|---|---|---|---|
| #12345 (4channel.org) | #10000 | /g/ | 4chan.org |

If the archive version were then fetched and more posts were discovered (i.e. ghost posts, or if the thread had been deleted while the device was unable to update), they would be added to the same thread:

| Post (Source) | Thread | Board | Site |
|---|---|---|---|
| #12345 (4channel.org) | #10000 | /g/ | 4chan.org |
| #12346 (rbt.asia) | #10000 | /g/ | 4chan.org |
| #12368 (rbt.asia) | #10000 | /g/ | 4chan.org |

That would allow you to merge the data from archives and original sites into a single, unified resource.
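A minimal sketch of this proposed model, in which each post records where its copy was fetched from while the site always names the original source. The `Post` class and `merge_posts` function are hypothetical, and letting the live copy win over an archived copy of the same post is an assumption made only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    number: int
    source: str  # where this copy was fetched from (live site or archive)
    thread: int
    board: str
    site: str    # the original site the thread belongs to


def merge_posts(live, archived):
    """Merge live and archived posts of one thread, keyed by post number.
    Assumption: when both have a post, the live copy is kept."""
    merged = {p.number: p for p in archived}
    merged.update({p.number: p for p in live})
    return [merged[n] for n in sorted(merged)]


live = [Post(12345, "4channel.org", 10000, "g", "4chan.org")]
archived = [
    Post(12345, "rbt.asia", 10000, "g", "4chan.org"),
    Post(12346, "rbt.asia", 10000, "g", "4chan.org"),
    Post(12368, "rbt.asia", 10000, "g", "4chan.org"),
]

for post in merge_posts(live, archived):
    print(f"#{post.number} ({post.source})")
# #12345 (4channel.org)
# #12346 (rbt.asia)
# #12368 (rbt.asia)
```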

K1rakishou commented 3 years ago

> That would allow you to merge the data from archives and original sites into a single, unified resource.

I already had such an archive system before the very first KurobaEx release (it was implemented sometime around June/July of 2020). It would automatically merge archive posts with normal posts to add back deleted posts (and images). You could select the archives from which posts would be fetched, and there was logic to determine the best-fitting archive (for example, if one archive supports media and another doesn't, the former would be chosen). There was also logic to switch from one archive to another automatically in case an archive died for any reason, or if one archive didn't have a thread archived but another one did. But it was complicated as hell, and in the end I got rid of it and never looked back; instead I use a simple redirection to an archived thread when requested. Now with thread previewing it's even more convenient, pretty much the same as when using 4chanx, and it's way simpler than the original implementation. So I'm not planning to go this route again: I've already tried it and it wasn't good.
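For readers unfamiliar with the kind of logic described here, the following is a rough sketch of "pick the best-fitting archive, with fallback". It is not the removed KurobaEx implementation; the `Archive` class, its fields, and the tie-breaking rule (keep the caller's order) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Archive:
    domain: str
    supports_media: bool
    alive: bool = True
    threads: frozenset = frozenset()  # thread numbers this archive holds


def best_archive(archives, thread_no):
    """Pick an archive that is reachable and holds the thread, preferring
    one that supports media. Returns None if no archive qualifies."""
    usable = [a for a in archives if a.alive and thread_no in a.threads]
    # Stable sort: media-capable archives first, original order otherwise.
    usable.sort(key=lambda a: a.supports_media, reverse=True)
    return usable[0] if usable else None


text_only = Archive("text-only.example", False, threads=frozenset({10000}))
with_media = Archive("with-media.example", True, threads=frozenset({10000}))

print(best_archive([text_only, with_media], 10000).domain)  # with-media.example

with_media.alive = False  # simulate the preferred archive going down
print(best_archive([text_only, with_media], 10000).domain)  # text-only.example
```

Even this toy version has to track liveness, per-thread coverage, and capabilities for every archive, which hints at why the full system grew complicated.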

By the way in your example you forgot about the ghost posts. Consider this:

| Post (Source) | Thread | Board | Site | Comment |
|---|---|---|---|---|
| #12345 (4channel.org) | #10000 | /g/ | 4chan.org | comment 1 |
| #12346 (rbt.asia) | #10000 | /g/ | 4chan.org | comment 2 |
| #12388,1 (rbt.asia) | #10000 | /g/ | 4chan.org | ghost comment (rbt) |
| #12388,1 (wakarimasen.moe) | #10000 | /g/ | 4chan.org | ghost comment (wakarimasen) |

Now there are two ghost posts (12388,1 and 12388,1) in thread 10000, each in a separate archive (a ghost post is basically a post that was made directly on an archive, so 4chan doesn't even know anything about it).

Trying to merge 12388,1 from rbt.asia with 12388,1 from wakarimasen.moe will create a conflict since they have the same postId (12388,1), so one will overwrite the other.
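The conflict can be reproduced with a small sketch that merges posts into a dictionary keyed by post ID alone. The tuple layout and the function name are hypothetical; the data comes from the table above.

```python
def merge_by_post_id(posts):
    """Merge (post_id, source, comment) tuples into a dict keyed only by
    post_id. Later entries silently replace earlier ones with the same ID."""
    merged = {}
    for post_id, source, comment in posts:
        merged[post_id] = (source, comment)
    return merged


ghost_posts = [
    ("12388,1", "rbt.asia", "ghost comment (rbt)"),
    ("12388,1", "wakarimasen.moe", "ghost comment (wakarimasen)"),
]

merged = merge_by_post_id(ghost_posts)
print(len(merged))        # 1
print(merged["12388,1"])  # ('wakarimasen.moe', 'ghost comment (wakarimasen)')
```

Two different posts went in but only one came out, and which one survives depends only on the order in which the archives were processed.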