Closed seanlynch closed 3 years ago
Yes, absolutely.
3accf4d adds a split command. Sadly, the remaining commands don't know how to handle a split archive. Currently, searches only work on the current data file, HTML exports are only created from the current data file, and so on. Ideally, these commands would open one data file after another and do their work on each.
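To make "open one data file after another" concrete, here is a minimal sketch of a search that walks every part of a split archive. It is not how mastodon-archive implements anything: the function name `search_all`, the glob pattern, and the assumption that each part is a JSON object holding a list under a key such as "statuses", with the text in a "content" field, are all illustrative.

```python
import glob
import json


def search_all(pattern, needle, collection="statuses"):
    """Yield (path, item) for every matching item in every file matching pattern.

    Hypothetical helper: assumes each file is a JSON object with a list
    under `collection`, and each item has a "content" string.
    """
    needle = needle.lower()
    # Sort so the parts are visited in a stable order.
    for path in sorted(glob.glob(pattern)):
        # Only one part is held in memory at a time.
        with open(path, encoding="utf-8") as fp:
            data = json.load(fp)
        for item in data.get(collection, []):
            if needle in (item.get("content") or "").lower():
                yield path, item
```

Something like `for path, status in search_all("example.user.alice.*.json", "coffee"): ...` would then cover all parts, where the file name pattern is again only an assumption about how the parts are named.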
Could we have a "join" command then? This would allow running a command on the whole archive if needed.
Shouldn’t we instead use an option that controls whether older files are read, for all the commands? Or perhaps we need to think about which commands we would like to run across all files? Perhaps a simple search-all-files is enough.
I think the --combine option introduced in #63 solves the problem. If anybody disagrees, feel free to reopen.
As far as I can tell, the database will just grow without bounds, even if one expires old posts. This is what I want, since I am using mastodon-archive to archive old posts before expiring them. But the file will become unwieldy, and I imagine the program will eventually start running out of memory.
I'm not sure what the best approach is. What I think I'd like is for the monotonically growing parts (statuses, favorites, mentions, and media) to be split up somehow by date, with periodic snapshots of the other data. Since the data is in a pretty simple format, this is already fairly easy to do with external tools, but it seems like the sort of thing that should be built into the software itself.
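As a rough illustration of the "external tools" route, the sketch below buckets one growing collection by year and writes each bucket to its own file. Everything here is an assumption made for the example: the helper names `split_by_year` and `write_buckets`, the output file naming, and the idea that each item has a "created_at" string starting with a four-digit year.

```python
import json
from collections import defaultdict


def split_by_year(items, date_key="created_at"):
    """Group items by the year at the start of their date string.

    Assumes the date is an ISO 8601-like string such as
    "2018-01-02T10:00:00Z"; only the first four characters are used.
    """
    buckets = defaultdict(list)
    for item in items:
        year = str(item[date_key])[:4]
        buckets[year].append(item)
    return dict(buckets)


def write_buckets(buckets, prefix, collection="statuses"):
    """Write one JSON file per year; the naming scheme is hypothetical."""
    paths = []
    for year, items in sorted(buckets.items()):
        path = f"{prefix}.{collection}.{year}.json"
        with open(path, "w", encoding="utf-8") as fp:
            json.dump({collection: items}, fp, indent=2)
        paths.append(path)
    return paths
```

The same two calls could be repeated for favorites and mentions, while the account data would be copied as a snapshot rather than split.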