ankitects / anki

Anki's shared backend and web components, and the Qt frontend
https://apps.ankiweb.net

Import/Export todos #1895

Open RumovZ opened 2 years ago

RumovZ commented 2 years ago

With basically all of the old code ported to Rust, it's time to turn to new features and improvements. This issue is supposed to track such todos, which have previously been scattered across various PRs and issues. Some discussion is on #1018, and this list will be updated as that discussion progresses.

General

Apkg

JSON

More notes on https://github.com/ankitects/anki/pull/1850#issuecomment-1122156181.

CSV

From https://forums.ankiweb.net/t/import-problems-suggestions/5856. Might also apply to JSON.

dae commented 2 years ago

Import deck options without scheduling information.

Not sure about this one. The styling/JS deck authors add to their shared decks are already a big source of issues, and allowing shared deck authors to push their option choices on users is likely to lead to confusion, and make support harder.

dae commented 2 years ago

Optionally export notetype and deck columns. (Let user decide if by name or id?)

Could we get away with just using a name?

RumovZ commented 2 years ago

allowing shared deck authors to push their option choices on users is likely to lead to confusion, and make support harder.

That's from https://forums.ankiweb.net/t/stats-ruined-when-importing-decks-from-others/15901/3, which you've linked in #1018 (although I mistakenly wrote "import" instead of "export"). I don't mind scratching that. But to pick up another thought from there: we could give users the option to import without scheduling. This should be quite simple to implement.

Could we get away with just using a name?

Probably a good compromise. Changing deck and notetype names should be rare enough to not cause too much inconvenience with exported CSVs.
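A minimal sketch of what name-based columns could look like on export, using Python's standard `csv` module. The `export_rows` helper, the note dictionaries and the column order are assumptions made up for illustration, not Anki's actual exporter or file layout.

```python
import csv
import io


def export_rows(notes, include_notetype=True, include_deck=True):
    """Write one CSV row per note, optionally prefixed by notetype and deck names.

    `notes` is a hypothetical list of dicts with "notetype", "deck" and "fields" keys.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for note in notes:
        row = []
        if include_notetype:
            row.append(note["notetype"])  # name, not id
        if include_deck:
            row.append(note["deck"])      # name, not id
        row.extend(note["fields"])
        writer.writerow(row)
    return out.getvalue()


# Hypothetical example data.
notes = [{"notetype": "Basic", "deck": "French::Verbs", "fields": ["aller", "to go"]}]
print(export_rows(notes), end="")
```

Because only names are written, a renamed deck or notetype would no longer match a previously exported file, which is the inconvenience mentioned above.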

dae commented 2 years ago

We could give users the option to import without scheduling.

If we're presenting a dialog for other reasons as well, that's certainly a possibility. I suspect the fact that the new exporting code excludes scheduling by default will dramatically reduce the number of users accidentally importing scheduling in the future, but such an option could still potentially be useful for older exports.

RumovZ commented 2 years ago

I had in mind that there already is a check for microsecond ids in the db check routine or something, but I can't find it. Am I misremembering? Would we just check if the timestamp is larger than today, maybe with a tolerance of 1 day to accommodate timezones? Most Anki ids are based on timestamps, but I think only note, card and revlog ids are ever converted to dates. Do we still need to check other ones? Should we let the user decide whether to skip or repair affected notes or just decide for them and report later? I wonder if implicitly repairing might not trip up third-party tools that expect certain ids to exist after the import.

dae commented 2 years ago

There's one for due numbers over a million, and a few caps applied in .sql files to ensure certain values fit in an i32 or u16. I'd also be wary about trying to automatically fix the problem due to the potential problems it could cause; I'd just been thinking we could detect such a case when importing, and present an error to the user like "This deck contains timestamps in the future. Please contact the deck author and ask them to fix the issue using the ... add-on."

Cards/notes/revlog would be the big ones; we can always add others in the future if it still proves to be an issue.
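The detect-and-report approach discussed above could be sketched roughly as follows. This is an illustration only, not the check Anki implements: the `ids_in_future` helper, the one-day default tolerance and the example values are all assumptions.

```python
import time

MS_PER_DAY = 24 * 60 * 60 * 1000


def ids_in_future(ids, now_ms=None, tolerance_ms=MS_PER_DAY):
    """Return the ids that, read as millisecond timestamps, lie beyond now + tolerance."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    limit = now_ms + tolerance_ms
    return [i for i in ids if i > limit]


# Hypothetical example values: a millisecond id and the same instant in microseconds.
now_ms = 1_650_000_000_000
ok_id = now_ms - 5_000     # created five seconds ago
micro_id = ok_id * 1000    # same moment, wrongly expressed in microseconds

flagged = ids_in_future([ok_id, micro_id], now_ms=now_ms)
if flagged:
    # Report instead of repairing, so existing ids are never silently changed.
    print("This deck contains timestamps in the future:", flagged)
```

A microsecond value read as milliseconds lands thousands of years ahead, so it is flagged regardless of how generous the tolerance is. The same function could be applied separately to note, card and revlog ids.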

njam commented 1 year ago

Small note about the check making sure IDs are not in the future (added in https://github.com/ankitects/anki/pull/1928):

This deck contains timestamps in the future. Please contact the deck author and ask them to fix the issue.

Some Anki deck generator libraries generate IDs based on the current timestamp, then add one millisecond for each new ID (e.g. Rust genanki-rs, Python genanki). Obviously this will produce IDs in the future, and thus the import will fail, albeit not for too long.

I really don't know what is a good way to solve this, so I'm adding a comment here hoping somebody else knows more. I guess the root problem is that "id" must be a timestamp and must be unique, which is probably not easy to resolve :D
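To make "not for too long" concrete, here is a small sketch of the arithmetic. It mirrors the id scheme as described in this comment, not the actual code of genanki or genanki-rs; the function names and the deck size are made up for illustration.

```python
def sequential_ids(start_ms, count):
    """Hypothetical scheme: the first id is the current time, each next id is +1 ms."""
    return [start_ms + n for n in range(count)]


def ms_until_importable(ids, now_ms):
    """Milliseconds until no id lies in the future, under a strict check with no tolerance."""
    return max(0, max(ids) - now_ms)


start_ms = 1_650_000_000_000            # hypothetical generation time
ids = sequential_ids(start_ms, 20_000)  # a deck with 20,000 generated ids

# Importing immediately after generation: the last id is 19,999 ms ahead.
print(ms_until_importable(ids, now_ms=start_ms))
# Importing 30 seconds later: every id is already in the past.
print(ms_until_importable(ids, now_ms=start_ms + 30_000))
```

Under this scheme the waiting time grows by one millisecond per generated id, so even large decks only fail a strict check for seconds or minutes after generation.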

RumovZ commented 1 year ago

IIRC the main purpose was to reject microsecond timestamps, which caused issues when interpreted by Anki as ms timestamps. If it makes life easier for third-party libraries, there shouldn't be a problem with accepting timestamps from the near future (like 24h), or is there, @dae?

dae commented 1 year ago

Sounds reasonable, I'll push that through.

RumovZ commented 1 year ago

Optionally limiting the duplication check to the current deck is a common feature request. Should I add it to the list, @dae?

dae commented 1 year ago

If it can be done without a lot of code or a big slowdown, sounds good.
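One way a deck-scoped duplicate check might stay cheap is to index existing notes once and do constant-time lookups per imported note. The sketch below assumes a simplified model in which each existing note is a `(first_field, deck_name)` pair; in Anki, decks belong to cards rather than notes, so this is an illustration of the idea and not the importer's data model. `find_duplicates` is a hypothetical name.

```python
from collections import defaultdict


def find_duplicates(existing, incoming, current_deck=None):
    """Return the incoming first fields that already exist.

    existing: iterable of (first_field, deck_name) pairs (simplified, hypothetical model).
    incoming: list of first-field strings about to be imported.
    current_deck: if given, only existing notes in that deck count as duplicates.
    """
    # Build the index in a single pass over the existing notes.
    decks_by_field = defaultdict(set)
    for first_field, deck in existing:
        decks_by_field[first_field].add(deck)

    dupes = []
    for field in incoming:
        decks = decks_by_field.get(field, set())
        if current_deck is None:
            if decks:
                dupes.append(field)
        elif current_deck in decks:
            dupes.append(field)
    return dupes


# Hypothetical example data.
existing = [("chat", "French"), ("hund", "German")]
print(find_duplicates(existing, ["chat", "hund"]))                         # whole collection
print(find_duplicates(existing, ["chat", "hund"], current_deck="German"))  # current deck only
```

The scoped variant reuses the same index as the collection-wide check, so the extra option adds only a set membership test per imported note.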

youdontneedtoknow22 commented 1 year ago

This is probably off-topic and could be deleted, but I just wanted to draw your attention to this bug (https://www.reddit.com/r/Anki/comments/zrlvwm/this_deck_contains_timestamps_in_the_future/); the solution in the comments worked for me. I was exporting a well-known Anki deck for medical students in Germany (Zankiphil) using version 2.1.60.

dae commented 1 year ago

This is covered on https://faqs.ankiweb.net/timestamps-in-the-future.html. Please use the forums if you have further questions.