laurent22 / joplin

Joplin - the privacy-focused note taking app with sync capabilities for Windows, macOS, Linux, Android and iOS.
https://joplinapp.org

All my tags are duplicated for some reason but this is only visible in the data API #8242

Closed nickhobbs94 closed 1 year ago

nickhobbs94 commented 1 year ago

I am developing a plugin for Joplin that automatically adds tags based on certain rules. I ran into an issue where my code seemed to work perfectly in the development environment but not on my actual notebook.

It turns out that almost all of my tags have duplicates, so my plugin code, which assumes that tag titles are unique, is wrong.

Here's some of the output of joplin.data.get(['tags']).

85: {id: '878ce433a6cb483ca17681a85d59e511', parent_id: '', title: 'language learning'}
86: {id: 'ec5965dc80944342bb8079de49d97c96', parent_id: '', title: 'language learning'}
87: {id: '3c0470466b4c43bfaac3b133b3c90ae7', parent_id: '', title: 'platonic solids'}
88: {id: '95351b8970854dddbb610def9cad9428', parent_id: '', title: 'platonic solids'}
89: {id: '53e47f7494df43c5ad1f7074d2659a8a', parent_id: '', title: 'graph theory'}
90: {id: '2a29b0053b234672b33ad50cd4b4da17', parent_id: '', title: 'project euler'}
91: {id: '62c781ee0a4d458a99ba4dfbd2d69258', parent_id: '', title: 'project euler'}
92: {id: '3b12b5e76c8148a497e25cb39a396e83', parent_id: '', title: 'mathematics'}
93: {id: 'ca1cd417e1a0455c83aea1d3f6eeb9a1', parent_id: '', title: 'mathematics'}
94: {id: '3c8c953a8a934e25b7870bea9930eb49', parent_id: '', title: 'mandelbrot set'}
95: {id: 'c16c218eff7d43778ab41d8c2da3df39', parent_id: '', title: 'mandelbrot set'}
96: {id: '20ee1d75ac43482eb30b79a9503d29d2', parent_id: '', title: 'shaders'}
97: {id: '95455423726341f691ebb3af0785fa4a', parent_id: '', title: 'shaders'}

As you can see, most are duplicates. graph theory is the only tag in this section without a duplicate.
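For anyone checking their own profile, here is a minimal sketch of grouping the returned tags by title to list the duplicated ones. The function name `findDuplicateTags` is hypothetical, and the trim and lower-case normalization is an assumption about how titles should be compared. The sample array reuses three of the entries above.

```javascript
// Group tags by normalized title and return the titles that occur more than once.
// `tags` is assumed to be an array of {id, title} objects, like the items
// returned by joplin.data.get(['tags']).
function findDuplicateTags(tags) {
  const byTitle = new Map();
  for (const tag of tags) {
    // Assumption: compare titles case-insensitively and ignore surrounding spaces.
    // Tags with an empty title all end up under the key ''.
    const key = (tag.title || '').trim().toLowerCase();
    if (!byTitle.has(key)) byTitle.set(key, []);
    byTitle.get(key).push(tag.id);
  }

  const duplicates = {};
  for (const [title, ids] of byTitle) {
    if (ids.length > 1) duplicates[title] = ids;
  }
  return duplicates;
}

const sample = [
  { id: '53e47f7494df43c5ad1f7074d2659a8a', title: 'graph theory' },
  { id: '2a29b0053b234672b33ad50cd4b4da17', title: 'project euler' },
  { id: '62c781ee0a4d458a99ba4dfbd2d69258', title: 'project euler' },
];
const duplicates = findDuplicateTags(sample);
console.log(duplicates);
```

In a plugin, the input would come from `joplin.data.get(['tags'])`, whose result is paginated, so every page would need to be collected before grouping.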

I'm wondering if there's a nice way of recovering from this state. I'll likely write some plugin code to recover from it myself, but I imagine this would break a good number of plugins out there. It'd be nice to have some sort of automatic cleanup of duplicate tags and of tags with an empty title (something I've noticed is possible to create in the dev environment if you mess up the API calls).

Environment

Joplin version: 2.10.19
Platform: macOS
OS specifics: darwin, M1 mac

Steps to reproduce

Unfortunately I don't really know how I got in this state. I have migrated between several instances of Joplin and sync methods over the years so it's likely to have been in one of those migrations or a buggy old version of Joplin.

Describe what you expected to happen

I expected some sort of cleanup to happen periodically so that this kind of thing gets recovered from nicely.

nickhobbs94 commented 1 year ago

Looking at a couple of these tags, it seems that in several duplicate pairs one of the two tags has no notes attached. We could get 90% of the way there if we could delete tags without any notes attached.

Daeraxa commented 1 year ago

I don't know for certain but I think these would be cleaned up when the note history period passes?

nickhobbs94 commented 1 year ago

Sorry I don't really know what that is. Most of these notes are very old (like at least a year). I'm guessing this period would've passed for most of these notes.

Daeraxa commented 1 year ago

You can check by seeing what your note history period is. If there are any revisions at all that reference those tags then they will stay.

nickhobbs94 commented 1 year ago

Oh, of course! Thank you, that makes sense. My note history is set to 90 days. If I look at the tag opc, I've got 3 notes:

But I've got two tags for opc

48: {id: 'b5b12229adc848ebb3c41bf75de834bb', parent_id: '', title: 'opc'}
49: {id: 'c563d60989b04217b102e88866fa2fdd', parent_id: '', title: 'opc'}

And if I run

await joplin.data.get(['tags', 'b5b12229adc848ebb3c41bf75de834bb', 'notes'])
 > {items: Array(3), has_more: false}
await joplin.data.get(['tags', 'c563d60989b04217b102e88866fa2fdd', 'notes'])
 > {items: Array(0), has_more: false}

So they're all attached to the b5... tag and don't appear to have any history.
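The same check could be automated for each group of tags that share a title. The sketch below is only an illustration: `get` stands in for `joplin.data.get` and is passed in as a parameter so the logic can be tried outside a plugin, the helper names `countNotes` and `findEmptyDuplicates` are hypothetical, and the use of the `page` query parameter together with `has_more` is an assumption based on how the data API paginates results.

```javascript
// Count the notes attached to a tag, following pagination via `has_more`.
// `get` is assumed to behave like joplin.data.get(path, query).
async function countNotes(get, tagId) {
  let count = 0;
  let page = 1;
  while (true) {
    const res = await get(['tags', tagId, 'notes'], { fields: ['id'], page });
    count += res.items.length;
    if (!res.has_more) return count;
    page += 1;
  }
}

// Among tag ids that share a title, return the ids with no notes attached.
// At least one tag is always kept so the title itself does not disappear.
async function findEmptyDuplicates(get, tagIds) {
  const empty = [];
  for (const id of tagIds) {
    if ((await countNotes(get, id)) === 0) empty.push(id);
  }
  // If every duplicate is empty, keep the first one and report the rest.
  return empty.length === tagIds.length ? empty.slice(1) : empty;
}

// Stand-in for joplin.data.get that mirrors the two opc results above.
const fakeNotes = {
  b5b12229adc848ebb3c41bf75de834bb: [{ id: 'n1' }, { id: 'n2' }, { id: 'n3' }],
  c563d60989b04217b102e88866fa2fdd: [],
};
const fakeGet = async (path) => ({ items: fakeNotes[path[1]] || [], has_more: false });

findEmptyDuplicates(fakeGet, Object.keys(fakeNotes)).then((ids) => console.log(ids));
```

Passing the getter in keeps the selection logic separate from the plugin runtime, which makes it easier to test before pointing it at a real profile.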

Daeraxa commented 1 year ago

In which case I guess the note revision service, rightly or wrongly, doesn't pick these up for removal. Not sure which is the correct behaviour.

nickhobbs94 commented 1 year ago

I guess I kind of understand why orphaned tags hang around. Say I create a new tag with:

await joplin.data.post(['tags'], null, {title: 'recipe'});

That tag exists now and is not attached to any notes. This obviously has to stay around for some period of time because I want to keep track of the id to add it to a note.

But I do want unattached tags to eventually be deleted right? Maybe there's no periodic cleanup at all. Tags might just get deleted when they get removed from the last note.
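If there is indeed no periodic cleanup, a plugin could remove unattached tags itself. The following is a rough sketch under stated assumptions: `del` stands in for `joplin.data.delete` and is assumed to accept a `['tags', id]` path in the same way `get` and `post` do, and the `deleteTags` helper and its `dryRun` flag are hypothetical names introduced for illustration.

```javascript
// Delete the given tags, or only log what would be deleted when dryRun is true.
// `del` is assumed to behave like joplin.data.delete(path).
async function deleteTags(del, tagIds, dryRun = true) {
  const handled = [];
  for (const id of tagIds) {
    if (dryRun) {
      console.log(`would delete tag ${id}`);
    } else {
      await del(['tags', id]);
    }
    handled.push(id);
  }
  return handled;
}

// Stand-in for joplin.data.delete that records the paths it receives.
const deletedPaths = [];
const fakeDelete = async (path) => { deletedPaths.push(path.join('/')); };

deleteTags(fakeDelete, ['c563d60989b04217b102e88866fa2fdd'], true);
```

Defaulting to a dry run is a deliberate choice here, since a tag that looks unattached may have just been created and not yet added to a note.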

nickhobbs94 commented 1 year ago

I'm doubtful that the duplication arose from the revision service. I think it was more likely an old bug, maybe on import or when swapping sync backends. The two opc tags were created at the same time.

{id: 'b5b12229adc848ebb3c41bf75de834bb', title: 'opc', created_time: 1666671938466, updated_time: 1666671938466, user_updated_time: 1666671938466, …}

{id: 'c563d60989b04217b102e88866fa2fdd', title: 'opc', created_time: 1666671938466, updated_time: 1666671938466, user_updated_time: 1666671938466, …}

Interestingly, that is Tue 25 Oct 2022 15:25:38 AEDT, which is much later than the last-updated time of any of the notes but still outside my 90 day revision history.
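The `created_time` values are Unix timestamps in milliseconds, so the conversion can be reproduced with a plain `Date`. The `Australia/Sydney` time zone is an assumption made here to match AEDT, and the exact format of the localized string depends on the runtime.

```javascript
// created_time as reported for both opc tags.
const createdTime = 1666671938466;
const created = new Date(createdTime);

// UTC representation.
console.log(created.toISOString()); // 2022-10-25T04:25:38.466Z

// Local representation for Sydney (AEDT is UTC+11 in late October).
console.log(created.toLocaleString('en-AU', { timeZone: 'Australia/Sydney' }));
```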

I think I likely swapped sync backends around this time (Joplin Cloud from Nextcloud). I think I vaguely remember some issues in the transition. I may have restored from a backup. I wonder if I can recreate this by creating a tagged note, backing up, deleting the notebook, and then importing from the backup.

nickhobbs94 commented 1 year ago

No luck reproducing the issue. It appears that when you import a JEX file, Joplin searches for existing tags with the same name. I wonder if it has always worked like this, however.

Daeraxa commented 1 year ago

Weird, I was going to suggest maybe it was from a bad import or due to sync conflicts.

dpoulton-github commented 1 year ago

> Tags might just get deleted when they get removed from the last note.

I had a vague memory of the subject of old, unused tags never being deleted coming up in the forum. I found this post from JUL19, which states:

> In any case, there’s no way to delete tags in Joplin, you can only unassociate them from all notes in which case they no longer show up in the UI. Afaik ‘deleting’ tags was never implemented.

So, at least historically, Joplin did not delete unused tags. This could explain the duplicates if it has not always reused tags present in the database but not assigned.

github-actions[bot] commented 1 year ago

Hey there, it looks like there has been no activity on this issue recently. Has the issue been fixed, or does it still require the community's attention? If you require support or are requesting an enhancement or feature then please create a topic on the Joplin forum. This issue may be closed if no further activity occurs. You may comment on the issue and I will leave it open. Thank you for your contributions.

github-actions[bot] commented 1 year ago

Closing this issue after a prolonged period of inactivity. If this issue is still present in the latest release, feel free to create a new issue with up-to-date information.

dryya commented 1 year ago

I wonder if this should be a broader issue about the inability to delete tags? I accidentally created a tag with a typo and now seem to have no way to get rid of it. There are a few old issues about this but they were all closed by the stale bot:

and... this issue got closed by the stale bot literally as I was typing this up