drahnr / cargo-spellcheck

Checks all your documentation for spelling and grammar mistakes with hunspell and a nlprule based checker for grammar
Apache License 2.0
314 stars 32 forks source link

fix: remove suggestions' duplication #286

Closed granddaifuku closed 11 months ago

granddaifuku commented 1 year ago

What does this PR accomplish?

Closes #10 .

Changes proposed by this PR:

To remove the duplicated suggestions, the Vector of Suggestion is converted once to a HashSet and then again to a Vector.

Notes to reviewer:

This PR tries to resolve the 🔰 part of the issue, but I'm willing to work on further parts of it.

📜 Checklist

drahnr commented 1 year ago

This practically solves the problem of identical suggestions, but how about partial overlaps? Do you have an idea (theoretically) how to deal with multiple overlapping suggestions, at least avoiding corruption?

granddaifuku commented 1 year ago

Thank you for your reviews!

I'm still striving to find a better solution for the case there are multiple mistakes.

Could I make sure my understanding is that

multiple mistakes that are overlapping and/or with different fix suggestions.

means Where checkers point out are same, but the way they suggest is different?

If so, I think we could achieve this by iterating through both resulted collections and merging the overlapped ones based on the fields of Suggestion: origin, span, and range. (Here, in this case, we don't need to use HashSet.) I'll appreciate it if you could give me feedback on this!

drahnr commented 1 year ago

The difficulty is overlapping changes that are confliciting, or even if they're not, how to merge them.

My idea would be, to omit one of them (the latter one), and re-run on the relevant chunk after applying the first suggestion.

granddaifuku commented 1 year ago

Thank you for your help!

It makes sense!!

So, in that case, we need to identify overlapping changes, then apply the suggestion internally on the firstly overlapped one, and re-run the checker on the latter ones. Does this match what you are thinking?

drahnr commented 1 year ago

Yes, that's precisely it. Sorting might help and then iterating over adjacent tuples, just an idea. I don't have a vision on how the checker would be re-run, though. In a perfect world, without any additional IO.

granddaifuku commented 1 year ago

Great thanks!

Maybe it will take a few days, but I will give it a try.

granddaifuku commented 1 year ago

Sorry for the slow progress. I have just implemented a method to determine the overlappings. After this, I will implement the method to determine and process the checker's overlappings.

drahnr commented 1 year ago

Hey @granddaifuku 👋 - gentle ping in the new year 🎆 - is there anything you're blocked on I could help with?

granddaifuku commented 1 year ago

Happy new year @drahnr ! Thank you for caring about this!!

I'm sorry I couldn't make any progress on this PR due to my thesis. My thesis defense is at the beginning of February, so I can restart working on this around the mid of February.

drahnr commented 1 year ago

Best of luck (which I am sure you don't need) @granddaifuku ! Feel free to ping me once you're done :)

drahnr commented 1 year ago

Gentle ping, now that people are starting to hit issues with this i.e. #295

granddaifuku commented 1 year ago

@drahnr Thank you for reaching out to me multiple times.
I would like you to know that my thesis defense is now over, so I will restart working on this PR this Sunday!

granddaifuku commented 1 year ago

@drahnr I made some updates that detect overlapped suggestions. However, I still could not find a way to re-run the checker after internally applying the first suggestion. Could you help me with that point?

drahnr commented 1 year ago

Hey @granddaifuku I'll try to find some time later this week :) for a hunch, see #296

granddaifuku commented 1 year ago

@drahnr Thank you for your reviews! I've addressed the stylistic point and will work on the second one as well. I will let you know once I've made the changes, so please review them when you have a chance. Thank you again.

granddaifuku commented 1 year ago

@drahnr Great, thank you for your review! I have made a change to remove any overlapping suggestions from the suggestions vector. I will also be making a change to the logic to be more accurate in detecting overlap and will update you once that is done.

granddaifuku commented 1 year ago

@drahnr I wanted to let you know that I made some changes to the is_overlapped method to check for overlaps in multi-line suggestions. Whenever you have some free time, I'd appreciate it if you could take a look and review the changes I made. If everything seems to be in order, I would like to add tests for the updated functionality.

drahnr commented 1 year ago

@granddaifuku sorry for the long silence, is there anything left to do from your perspective? Did you test it as part of the dummy projects in the working tree?

granddaifuku commented 1 year ago

@drahnr Thank your for your response, and I'm sorry for not ping you again for a long time. I already checked it with a demo project, and it worked. I will make this PR ready for a review then.