Stvad / CrowdAnki

Plugin for Anki SRS designed to facilitate cooperation on creation of notes and decks.
MIT License

Export single files per card instead of one huge .json file to allow for better merging? #163

Open · EtzBetz opened this issue 2 years ago

EtzBetz commented 2 years ago

Hey there!

First I have to say I'm already pretty impressed by how nicely CrowdAnki works. The only thing I would like is for the plugin to export every card into its own .json file. Why? Because that way you can resolve merge conflicts much more easily.

Is there a specific reason why it wasn't made this way? The only problem I can imagine is that synchronizing deleted cards would eventually be troublesome, since you'd either have to detect that a card was deleted and delete them one by one, or delete the whole card export folder in one step and then re-export all existing cards. That way, all removed cards would be gone, and cards which still exist would simply be re-added anyway (rough sketch below).

This would eventually also solve #36?
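
Roughly what I have in mind for the "delete the folder and re-export" part, as a sketch only (the note structure and the UUID file naming are just my assumptions, not anything CrowdAnki does today):

```python
import json
import shutil
from pathlib import Path

def export_notes(notes_dir: Path, notes: list) -> None:
    """Naive "wipe and re-export" sync: deleted notes simply disappear,
    because the whole folder is rebuilt from the current collection."""
    if notes_dir.exists():
        shutil.rmtree(notes_dir)  # drop all stale per-note files
    notes_dir.mkdir(parents=True)
    for note in notes:
        # Using the note's UUID as the file name is just my assumption here.
        out_path = notes_dir / f"{note['guid']}.json"
        out_path.write_text(
            json.dumps(note, ensure_ascii=False, indent=2, sort_keys=True),
            encoding="utf-8",
        )
```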

Greetings, EtzBetz

aplaice commented 2 years ago

Thanks for the suggestion! That'd be an interesting approach. (I assume that each file would be named after the note's uuid? The deck/subdeck that the note belongs to could be stored as an extra field in the JSON, so all the current information would still be available. The note models etc. would also be separate JSON files, also named according to UUIDs.)
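
Something like this is what I'm picturing for an individual note file (all key names below are illustrative, not the actual CrowdAnki schema):

```python
# Hypothetical contents of notes/<uuid>.json in a per-note layout.
# These key names are illustrative only, not CrowdAnki's actual schema.
example_note = {
    "guid": "f1c2d3e4-...",             # also used as the file name
    "note_model_uuid": "a1b2c3d4-...",  # points at note_models/<uuid>.json
    "deck": "Japanese::Vocabulary",     # extra field, so deck membership survives the split
    "fields": ["犬", "dog"],
    "tags": ["animals"],
}
```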

Some additional disadvantages that I can see:

  1. (Less relevant now that it's mostly expected that the decks and notes are stored in git repositories.) If the given deck didn't have any media, it was slightly easier to pass around a single JSON file (though the collection of JSON files could still be zipped).

  2. Editing a large collection of JSON files might be annoying. When it's a single, huge JSON file, it's still annoying, but if you want to find a given note you just search in that file; when it's a large number of JSON files, you need to first find the correct file (e.g. with grep), which is slightly less convenient.

  3. Reading a large number of JSON files might be less performant than reading a single large JSON file (see the sketch below this list).
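
To illustrate point 3, here's a rough sketch of the two read paths (the paths and the "notes" key are made up, and I haven't benchmarked anything):

```python
import json
import time
from pathlib import Path

def load_single_file(deck_json: Path) -> list:
    """One big deck.json: a single read and a single parse."""
    return json.loads(deck_json.read_text(encoding="utf-8"))["notes"]

def load_per_note_files(notes_dir: Path) -> list:
    """Many small files: one open/read/parse per note, so mostly I/O overhead."""
    return [
        json.loads(p.read_text(encoding="utf-8"))
        for p in sorted(notes_dir.glob("*.json"))
    ]

# Hypothetical timing comparison (the paths are made up):
if __name__ == "__main__":
    start = time.perf_counter()
    load_single_file(Path("export/deck.json"))
    print(f"single file:    {time.perf_counter() - start:.3f}s")

    start = time.perf_counter()
    load_per_note_files(Path("export/notes"))
    print(f"per-note files: {time.perf_counter() - start:.3f}s")
```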

I'm not sure if we'll have a good opportunity to revisit the question of whether to split the JSON per note (e.g. due to an overhaul of the schema), but I'll keep it in mind.

Hopefully, with fixed sorting within the JSON, merge issues should occur less frequently, even with a single JSON file.
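
For context, "fixed sorting" just means making the export deterministic, roughly along these lines (the "notes" and "guid" keys here are placeholders, not necessarily the real schema):

```python
import json

def serialize_deck(deck: dict) -> str:
    """Deterministic export: stable key order and stable note order, so two
    exports of the same collection produce identical JSON and diffs stay small."""
    deck = dict(deck)
    # Sorting notes by UUID is one possible stable order; any content-independent
    # key would have the same effect on merges.
    deck["notes"] = sorted(deck.get("notes", []), key=lambda n: n["guid"])
    return json.dumps(deck, ensure_ascii=False, indent=2, sort_keys=True)
```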

Stvad commented 2 years ago

#38 (or in general considering additional serialization formats) is also potentially a bit relevant to the discussion

also look into https://github.com/ohare93/brain-brew, which allows you to convert the JSON into other, potentially nicer formats

EtzBetz commented 2 years ago

I assume that each file would be named after the note's uuid?

That would be the simplest form probably, yes.

The note models etc. would also be separate JSON files, also named according to UUIDs

This I'm not even bothered about, to be honest. But since we're at it, it would probably make sense to also separate them, yes. Otherwise other users will come in at some point and suggest the same for note models.

1. If the given deck didn't have any media, it was slightly easier to pass around a single JSON file[.]

Yes, I kind of get that. But you also already described a solution to it, which isn't bothersome. And yeah, maybe I'm too old for that and/or too much into IT, but who shares files so manually these days (and is then using Anki AND CrowdAnki)?

2. Editing a large collection of JSON files might be annoying.

I get that as well, but wouldn't you want to edit the decks in Anki anyway? That's at least my workflow as of now. If it's a real issue, the only solution that comes to my mind right now would be to change the naming scheme to something like "_"?

3. Reading a large number of JSON files might be less performant than reading a single large JSON file.

Here you are very right. I'm just too new to Anki to know whether there are any really big decks out there. We'd have to make sure that the export doesn't stall, at least. But for default-sized decks I imagine this shouldn't be an issue?

I'm not sure if we'll have a good opportunity to revisit the question of whether to split the JSON per note (e.g. due to an overhaul of the schema), but I'll keep it in mind.

Thanks for this nice answer! Usually feature requests such as this aren't answered this politely. I would also kind of like to contribute to a solution, I just don't know my way around Anki plugins etc. Maybe I'll take a look when I have the time in a few weeks, or do you have one or two pointers for me?

Greetings, EtzBetz

EtzBetz commented 2 years ago

#38 (or in general considering additional serialization formats) is also potentially a bit relevant to the discussion

Oh, yeah. But would there be formats which would have actual issues, apart from just being slower when exporting multiple files? I also thought about suggesting that this multi-file export be optional, but that would probably be even more work and more maintenance...

also look into https://github.com/ohare93/brain-brew

Oh, thanks! It looks like that's the tool I was actually thinking about writing myself, if it does actually export CrowdAnki into multiple files.

Greetings

aplaice commented 2 years ago

And yeah, maybe I'm too old for that and/or too much into IT, but who shares files so manually these days (and is then using Anki AND CrowdAnki)?

Yeah, true.

I get that as well, but wouldn't you want to edit the decks in Anki anyway? That's at least my workflow as of now.

Most of the time, yes, the "expected" workflow is to edit in Anki, but it's sometimes convenient to be able to edit elsewhere.

Here you are very right. I'm just too new to Anki to know whether there are any really big decks out there. We'd have to make sure that the export doesn't stall, at least. But for default-sized decks I imagine this shouldn't be an issue?

For "normal-sized" decks (≲1000 notes) it shouldn't be an issue. However, there are some huge shared decks; also some people use CrowdAnki to have a convenient human-readable back-up of their decks (with git history) and they (rightly) complain that their snapshots are taking very long, already. (I don't know whether splitting the JSON file into per-note pieces will actually, significantly degrade performance further — that would have to be explicitly checked once the "good opportunity to revisit the question" (adding an option for YAML output, like mentioned by Stvad above, might also be such an opportunity).

Maybe I'll take a look when I have the time in a few weeks, or do you have one or two pointers for me?

No strong recommendations. The official add-on docs are quite good. I recommend looking at other people's add-ons or just making some modifications that would be useful for you.

I also thought about suggesting that this multi-file export be optional, but that would probably be even more work and more maintenance...

Some more maintenance and testing, yes.

Oh, thanks! It looks like that's the tool I was actually thinking about writing myself, if it does actually export CrowdAnki into multiple files.

BrainBrew is great, as it allows converting between CrowdAnki's output and several different formats (e.g. CSV files). I don't think it has an option to convert CrowdAnki's JSON into a collection of individual JSON files yet, though it should be relatively easy to implement.
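
In the meantime, the splitting itself could be done as a small post-processing script on top of a normal CrowdAnki export. A rough sketch, assuming the export has a top-level "notes" list with "guid" fields:

```python
import json
from pathlib import Path

def split_crowdanki_export(deck_json: Path, out_dir: Path) -> None:
    """Split an existing CrowdAnki deck export into one file per note,
    named after the note's UUID. The "notes" and "guid" keys are assumptions
    about the export format; adjust to whatever the real schema uses."""
    deck = json.loads(deck_json.read_text(encoding="utf-8"))
    notes_dir = out_dir / "notes"
    notes_dir.mkdir(parents=True, exist_ok=True)
    for note in deck.get("notes", []):
        (notes_dir / f"{note['guid']}.json").write_text(
            json.dumps(note, ensure_ascii=False, indent=2, sort_keys=True),
            encoding="utf-8",
        )
    # Keep everything except the notes in a small remainder file.
    rest = {key: value for key, value in deck.items() if key != "notes"}
    (out_dir / "deck.json").write_text(
        json.dumps(rest, ensure_ascii=False, indent=2, sort_keys=True),
        encoding="utf-8",
    )
```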