Mike7154 / DupCatch

This tool finds duplicate Anki cards that Anki's built-in 'Find Duplicates' function does not identify.
GNU General Public License v3.0
anki anki-flashcards duplicate-detection python

DupCatch: The Anki Duplicates Finder


DupCatch V2.0.2.beta

The Anki Duplicates Finder
Report Bug · Request Feature

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. License
  7. Contact
  8. Acknowledgments

About The Project

Description

I built this tool to identify duplicate notes in Anki. The built-in Anki tool only finds duplicates when the field contents are identical. DupCatch calculates a similarity score for every pair of notes and ranks the pairs from most to least similar, so that non-identical duplicates can be found.
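As an illustration of the pairwise scoring idea, here is a minimal sketch in Python. It is not DupCatch's actual algorithm, which this README does not describe; it substitutes the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure. The function name `rank_similar_pairs`, the `threshold` parameter, and the sample notes are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def rank_similar_pairs(notes, threshold=0.8):
    """Return (score, id_a, id_b) for note pairs scoring at or above
    threshold, sorted with the most similar pair first.

    `notes` maps a note id to the text of the field being compared.
    NOTE: hypothetical helper, not part of DupCatch itself.
    """
    scored = []
    # Compare every unordered pair of notes exactly once.
    for (id_a, text_a), (id_b, text_b) in combinations(notes.items(), 2):
        # ratio() is 1.0 for identical strings and lower for dissimilar ones.
        score = SequenceMatcher(None, text_a.lower(), text_b.lower()).ratio()
        if score >= threshold:
            scored.append((score, id_a, id_b))
    return sorted(scored, reverse=True)


# Made-up sample notes: 1 and 2 are near-duplicates, 3 is unrelated.
notes = {
    1: "What is the capital of France?",
    2: "What's the capital of France?",
    3: "Name the largest planet in the solar system.",
}

for score, id_a, id_b in rank_similar_pairs(notes):
    print(f"{id_a} <-> {id_b}: {score:.2f}")
```

Comparing every pair grows quadratically with the number of notes, so a large collection would likely need a cheaper first pass before scoring.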

There are two functions.

  1. Find Duplicates
    • Tags all likely duplicate pairs for review, sorted from most to least similar according to the similarity score
  2. Merge
    • You can use this tool to merge duplicate fields or tags.


Installation

  1. Clone or download and unzip the repo
    git clone https://github.com/Mike7154/DupCatch.git
  2. Install dependencies (Python must be installed and available on your PATH)
    pip install -r requirements.txt

    or

    py -m pip install -r requirements.txt
  3. You can verify the installation by running:
    cd path/to/DupCatch
    py dupcatch.py

    or

    cd path/to/DupCatch
    python dupcatch.py


Usage

See more detailed instructions at https://1drv.ms/w/s!Ar3STOvKhP6Ymts6IuxRUntBSumx5A?e=h9scEH

  1. Copy the desired Anki package file (.apkg, .colpkg) to 'DupCatch/anki_collection'

  2. Modify settings.yml (copy settings_template.yml to settings.yml if settings.yml doesn't exist yet)

    • If you are doing a 'Duplicates' run, at least modify the 'Duplicates' section in settings.yml
    • If you are doing a 'Merge' run, at least modify the 'Merge' section in settings.yml
  3. Run the Script

    cd path/to/DupCatch
    python -m pip install -r requirements.txt
    python dupcatch.py -r
    • For a Merge run use python dupcatch.py -m
  4. The tool writes a new *.apkg file to DupCatch/anki_collection containing only the notes that were modified

  5. Review the results in Anki (I recommend using the Special Fields Addon to choose whether you want tags or a full import)

  6. Tag the notes to merge fields, merge tags, or mark them as 'not duplicate' or 'covered_by'

    • Only the merge tags are directional; tag only the receiving note.
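Steps 1 and 2 above (copying the package and creating settings.yml from the template) can also be scripted. The following is a rough sketch rather than anything shipped with DupCatch: the helper name `prepare_run` is hypothetical, and it assumes that settings.yml and settings_template.yml sit in the repository root, which this README does not state explicitly.

```python
import shutil
from pathlib import Path


def prepare_run(repo_dir, package_path):
    """Copy an Anki package into <repo>/anki_collection and make sure
    settings.yml exists, creating it from the template if needed.

    NOTE: hypothetical helper; the file locations are assumptions.
    Returns the path of the copied package and the settings file.
    """
    repo = Path(repo_dir)

    # Step 1: place the .apkg/.colpkg file in the anki_collection folder.
    collection_dir = repo / "anki_collection"
    collection_dir.mkdir(parents=True, exist_ok=True)
    copied = Path(shutil.copy2(package_path, collection_dir))

    # Step 2: create settings.yml from the template only if it is missing,
    # so existing settings are never overwritten.
    settings = repo / "settings.yml"
    if not settings.exists():
        shutil.copy2(repo / "settings_template.yml", settings)

    return copied, settings
```

For example, `prepare_run("path/to/DupCatch", "MyDeck.apkg")` would leave a copy of the package in `anki_collection` and a settings.yml ready to edit by hand (the 'Duplicates' or 'Merge' section) before running the script.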


Roadmap

See the open issues for a full list of proposed features (and known issues).


Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request


License

Distributed under the GNU General Public License v3.0. See LICENSE.txt for more information. You can copy, modify, and distribute this software as you please. If you want to use this tool in proprietary software, contact me and I will send a private license agreement.


Contact

Mike7154 - m6611022@gmail.com

Project Link: https://github.com/Mike7154/DupCatch


Acknowledgments
