Closed akosbalasko closed 1 year ago
Thanks! Really looking forward to this one. Happy to serve as a tester.
Hi @matboehmer! I've created a pre-release with this Levenshtein-distance linking feature. Feel free to download it from https://github.com/akosbalasko/yarle/releases/tag/v5.8.0 and test it. Thanks a lot!
Thanks, great! How can I run the code using npx or any other way? I am not sure if npx -p yarle-evernote-to-md@5.8.0 yarle --configFile config.json uses the latest code.
@matboehmer yes yes, it should work as you wrote, just extend your config.json with a new property:
useLevenshteinForLinks: true
Thanks, got it! However, it does not work for me. It seems like applyLinks in apply-links.js is only called once, and the if (options.useLevenshteinForLinks) block is also only executed once (I added a console output for debugging). However, in the test set I posted in #530 there are 4 links. So, from my understanding, the Levenshtein lookup should also be done 4 times?
Hm... it iterates through the recognized links and replaces the link URLs everywhere in the notes folder. Let me check.
It's hard to create a real test for multiple links, because Evernote is currently failing to sync for me. So it will take a bit of time, sorry.
@matboehmer could you pls give it a try via the UI? thanks a lot!
Same result; also does not work using the app UI. Does it work for you? Do you have some test data you could share?
@matboehmer, yes, here is the enex I use for testing: https://github.com/akosbalasko/yarle/blob/master/test/data/test-levenshtein-links.enex
And here are the two notes created: https://github.com/akosbalasko/yarle/blob/master/test/data/test-levenshtein-linksNoteA.md and https://github.com/akosbalasko/yarle/blob/master/test/data/test-levenshtein-linksNoteB.md
It works for me with your data set, but not with the one I posted here: https://github.com/akosbalasko/yarle/issues/530#issuecomment-1728543525
I think the one you shared in that comment reflects a different issue, which cannot be resolved easily. What I implemented is that the referenced note is recognized by the shortest Levenshtein distance to the note name given in the link. For instance, if the name of the note is mistyped, say notA is typed instead of noteA, and there is no note whose name is more similar than this, then noteA is going to be picked.
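To make the notA / noteA example concrete, here is a minimal TypeScript sketch of a closest-name lookup. It is not Yarle's actual code: the names levenshtein and closestTitle are hypothetical, and keeping the first candidate on a tie is an assumption made only for this illustration.

```typescript
// Classic dynamic-programming Levenshtein distance:
// insertion, deletion and substitution each cost 1.
function levenshtein(a: string, b: string): number {
  // At the start of iteration i, prev[j] is the distance between
  // the first i-1 characters of a and the first j characters of b.
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // deletion
        curr[j - 1] + 1, // insertion
        prev[j - 1] + cost // substitution or match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: return the title closest to linkText,
// or undefined when there are no titles at all.
// Assumption: on a tie, the first title encountered wins.
function closestTitle(linkText: string, titles: string[]): string | undefined {
  let best: string | undefined;
  let bestDistance = Infinity;
  for (const title of titles) {
    const d = levenshtein(linkText, title);
    if (d < bestDistance) {
      bestDistance = d;
      best = title;
    }
  }
  return best;
}

console.log(closestTitle("notA", ["noteA", "noteB"])); // → "noteA"
```

Here notA is one edit away from noteA (insert an e) and two edits away from noteB, so noteA is picked.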
In my example data in https://github.com/akosbalasko/yarle/issues/530#issuecomment-1728543525 the wrong link is created as [[first-note|second note]] in both files, first-note and second-note. However, the link [[first-note|second note]] could be fixed to [[second-note|second note]] (i.e., replacing first-note with second-note) by looking up a proper link target using Levenshtein distance.
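As a rough illustration of the fix proposed in this comment, the sketch below re-points each [[target|alias]] link at the known note name closest to the alias text. This is only a sketch under assumptions, not what Yarle does internally: editDistance and relinkByAlias are hypothetical names, the comparison is lowercased by choice, and ties go to the first candidate.

```typescript
// Compact Levenshtein distance using a single rolling row.
function editDistance(a: string, b: string): number {
  const row: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // distance for (i-1, j-1)
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // distance for (i-1, j)
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

// Hypothetical helper: rewrite every [[target|alias]] link so that its target
// becomes the note name closest to the alias. Links without an alias part
// do not match the pattern and are left untouched.
function relinkByAlias(markdown: string, noteNames: string[]): string {
  const wikiLink = /\[\[([^\]|]+)\|([^\]]+)\]\]/g;
  return markdown.replace(wikiLink, (whole, _target: string, alias: string) => {
    let best: string | undefined;
    let bestDistance = Infinity;
    for (const name of noteNames) {
      const d = editDistance(alias.toLowerCase(), name.toLowerCase());
      if (d < bestDistance) {
        bestDistance = d;
        best = name;
      }
    }
    // With no candidate names, keep the original link as-is.
    return best === undefined ? whole : `[[${best}|${alias}]]`;
  });
}

const fixed = relinkByAlias("See [[first-note|second note]].", ["first-note", "second-note"]);
console.log(fixed); // → "See [[second-note|second note]]."
```

The alias "second note" is a single substitution away from second-note (space versus hyphen) and much further from first-note, so the target is replaced with second-note.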
@matboehmer, okay, I found a bug in the unique id recognizer that could cause links to overlap each other. It is fixed now. I checked with your example and, as far as I can see, it fixes your issue, but please confirm. Thanks a lot!
Great, thank you! Works perfectly now on my test data set and already really good on my real data set. Thank you very much for adding this feature!
matboehmer's great idea is that, in order to increase the number of links whose two ends can be connected, Yarle could calculate the Levenshtein distance between the text of the link and the titles of the existing notes it created, and apply the link to the note with the minimal distance.
If more than one note has the minimal distance, Yarle could, based on another setting, do one of the following:
As an MVP I would implement case 3.