kensanata / mastodon-archive

Archive your statuses, favorites and media using the Mastodon API (i.e. login required)
https://alexschroeder.ch/software/Mastodon_Archive
GNU General Public License v3.0

Archiving context #110

Open timmc opened 2 weeks ago

timmc commented 2 weeks ago

There are two closely related features I would be interested in:

Use-cases:

Known complications:

Is this a feature set you'd be open to including?

kensanata commented 2 weeks ago

I’m afraid that this strays into creepy, hostile archiving territory, so I think I’d want to say “no”. See also #107.

kensanata commented 2 weeks ago

On the old software wiki, which is no longer up, somebody called bob wrote:

If I reply to someone for the first time in a thread, their message does not mention me. So the archive only has my post and not the parent. Archiving mentions only captures replies to me. If the conversation builds from there, the subsequent messages are archived because they mention me. But the head is missing. There are a lot of cases where I reply to someone and the context is lost because the server admin deletes all externally authored old messages. So it’s somewhat important to have a copy of what I replied to.
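The gap bob describes can be made concrete with a small sketch. It assumes the archive is a JSON object with top-level "statuses", "mentions" and "favourites" lists of Mastodon API status objects (an assumption about the file layout, not something confirmed in this thread); `in_reply_to_id` is the field the Mastodon API uses to point at the parent status. The function name `missing_parents` is hypothetical.

```python
def missing_parents(archive):
    """Return ids of replied-to statuses that are absent from the archive."""
    # Collect every status id the archive already holds, in any collection.
    # The collection names are an assumption about the archive layout.
    known = set()
    for collection in ("statuses", "mentions", "favourites"):
        for status in archive.get(collection, []):
            known.add(str(status["id"]))

    # A reply whose parent id is not among the known ids has lost its context.
    missing = []
    for status in archive.get("statuses", []):
        parent = status.get("in_reply_to_id")
        if parent is not None and str(parent) not in known:
            missing.append(str(parent))
    return missing


# Made-up sample data: status 3 replies to 1 (never archived),
# status 5 replies to 4 (archived because it mentioned the user).
archive = {
    "statuses": [
        {"id": "3", "in_reply_to_id": "1"},
        {"id": "5", "in_reply_to_id": "4"},
        {"id": "6", "in_reply_to_id": None},
    ],
    "mentions": [{"id": "4", "in_reply_to_id": "3"}],
}
print(missing_parents(archive))  # ['1']
```

This only reports which heads are missing; whether to fetch and keep them is the question the rest of the thread is about.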

Kind of along the same lines, it would also be useful to store links to external copies of archived messages. E.g. if my admin cleans up and deletes all locally cached external statuses, I don’t just want a copy of the text but it would be useful to visit the message on the external server to see if a more complete copy of the thread is maintained there.
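As a rough illustration of the "links to external copies" idea: Mastodon API status objects carry a `url` field (the HTML page for the status) and a `uri` field (the identifier used for federation), and for a remote status these normally point at the author's server. The helper below is a sketch with a hypothetical name, `external_links`, and made-up sample data; it is not part of mastodon-archive.

```python
def external_links(statuses):
    """Map each status id to a link on the originating server, when one is present."""
    links = {}
    for status in statuses:
        # Prefer the human-readable page; fall back to the federation URI.
        link = status.get("url") or status.get("uri")
        if link:
            links[str(status["id"])] = link
    return links


# Made-up sample data for illustration only.
statuses = [
    {"id": "4", "url": "https://example.social/@bob/4",
     "uri": "https://example.social/users/bob/statuses/4"},
    {"id": "7", "url": None,
     "uri": "https://example.org/users/carol/statuses/7"},
    {"id": "8"},
]
for status_id, link in external_links(statuses).items():
    print(status_id, link)
```

Keeping only the link rather than the text would be one way to revisit a thread later without storing someone else's words.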

I said at the time: I'm sort of loath to add this feature because I don't want to turn mastodon-archive into a general status archiver. I find the toots that mention you are already not the ones you wrote, so in terms of copyright and moral rights, it seems that making and keeping copies without the author's consent is already shady – to then go up the chain and pull even more statuses with an even more tenuous connection seems to stray even further in that territory. I don't want to do this.

bob had a longer reply discussing the ethics:

Mentions: If Bob mentions Alice, Bob owns the copyright but the msg was directed to Alice & intended for her to receive the msg. Like traditional correspondence I think authors never have an expectation that the recipient will destroy the msg. From a GDPR standpoint, the msg is often about Alice thus she has a right to make a GDPR access request to get that data. OTOH, Bob also has a GDPR right to be forgotten although with microblogging he probably doesn’t assume she kept a copy. Strictly speaking, the GDPR does not apply whatsoever because the people are anonymous. But it’s still perhaps useful to think of situations in the spirit of the GDPR. In principle it might be most fair to sync with the server & if Bob deletes a msg then all copies should be deleted too. OTOH, adding a purge nonexisting command would add complexity to the code & perhaps be heavy on bandwidth & server load.
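The "purge nonexisting" idea mentioned above could look roughly like the sketch below. It is not an existing mastodon-archive command. The `still_exists` callback is a hypothetical stand-in for whatever check is used; in practice it would mean one server request per archived status, which is the bandwidth and server-load cost bob points out.

```python
def purge_nonexisting(statuses, still_exists):
    """Split archived statuses into those still on the server and those that are gone.

    `still_exists` is a hypothetical callback taking a status dict and
    returning True when the original can still be found.
    """
    kept, purged = [], []
    for status in statuses:
        if still_exists(status):
            kept.append(status)
        else:
            purged.append(status)
    return kept, purged


# Made-up sample data: pretend the server only still has status "2".
archived = [{"id": "1"}, {"id": "2"}, {"id": "3"}]
on_server = {"2"}
kept, purged = purge_nonexisting(archived, lambda s: s["id"] in on_server)
print([s["id"] for s in kept])    # ['2']
print([s["id"] for s in purged])  # ['1', '3']
```

Passing the check in as a function keeps the sketch independent of any particular client library and lets the caller decide how often to hit the server.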

Favourites: we often favourite posts that mention us as an ack signal. So it’s essentially the same as the rationale for mentions. But sometimes we fav things that don’t mention us. Since Alice fav'ing Bob’s msg sends a signal that Alice liked it, we could perhaps rationalize it due to the signal that Alice liking it is about Alice thus she should have access to it.

Boosts: hmm… we are not archiving boosts, are we? Boosting something in itself is a copyright violation AFAICT, no? OTOH, I think if an author doesn’t want their work replicated (reblogged) they set the visibility to private. So maybe it’s fair to say public and unlisted msgs are implicitly authorized by authors for copying since they know boosting is possible. This same rationale could perhaps apply to all other archive triggers/collections.

Bookmarks: these are maybe a little sketchier than favs because you don’t send a signal that you liked that post, so a bookmark is in no way about yourself. Bookmarks are probably even sketchier than msgs that you replied to.

Parent msgs that don’t mention you: When you reply to someone, in pre-microblog days we would quote the other person & quotes are “fair use” under the fair use doctrine because you are publishing commentary & have a right to quote portions that your work refers to. It only gets legally dicey if you were to quote someone’s whole book and then just give a one-liner comment. We omit the quotes only because it’s too bulky for microblogging. Consider what happens when Alice replies to Bob then Bob deletes his msg. He has a right to delete his msg, but then Alice’s post looks out of context & maybe even foolish. One consideration is that Bob’s erasure diminished Alice’s derivative work. Alice has rights too and on non-microblog mediums the quote would persist even when the original is gone. Sometimes Bob’s erasure of his own msg is purely a deliberate act of sabotage against Alice. He would have kept his msg but sacrificed it to orphan Alice’s work. Of course Bob’s erasure can just as well be legit with no intent to undermine Alice.

I’m not a lawyer.. just trying to shed light on factors to consider.

Another thing to consider is that if someone uses public visibility & also hashtags something, then they show some intent to have the post stored and searchable, perhaps long-term.

Fair points all around! And indeed, all of this is a quagmire: keeping copies of bookmarks, keeping copies of favourites. Syncing with the originals might be one way out. Anonymizing toots where the original is gone might be another. But I suspect there is no easy, automated way out of the quagmire. All I know is that it is tricky and I want less of it – and I definitely don't want more of it.

But yeah, now that you've listed all the problems with archiving, I'm ready to hand the project over to anybody who wants to take it.

timmc commented 2 weeks ago

Thanks for the reply. I totally understand if you don't want to support it for whatever reason. A few additional thoughts, though:

But yeah, now that you've listed all the problems with archiving, I'm ready to hand the project over to anybody who wants to take it.

Heh, sorry to have contributed to any additional maintainer-stress. I think it's fine for you to just draw a line and say "not interested, feel free to fork" if there's a feature you just don't want to add for any reason.

(I might make my own fork that archives context, but I'd be very unlikely to implement #107 -- if only just because that seems like scope creep, besides any of the social reasons.)