fonol / anki-search-inside-add-card

An add-on providing full-text-search and PDF reading functionality to Anki's Add card dialog
https://ankiweb.net/shared/info/1781298089
GNU Affero General Public License v3.0

Bulk add notes #191

Open aviral-batra opened 3 years ago

aviral-batra commented 3 years ago

Hi, amazing addon!

I was wondering if there was a way to bulk add notes? Context: I have a large Obsidian vault and wanted to add the notes in my vault recursively to SIAC. In this use case it would also be great if there was a way to set reviewer settings in bulk.

fonol commented 3 years ago

Hi, I could imagine a bulk import from a vault. One could import the files as a link to the original file (like if you click External File in the Create dialog), or put the file's content in the Text section of the note.

A potential issue I see is that if you import again at a later point, there is no easy way to detect whether a renamed Obsidian note already exists as an add-on note. To my knowledge, Obsidian names the .md file exactly like the individual note's title. That would mean that if you even slightly change the title of a note in Obsidian, the respective add-on note would point to an invalid path.

That said, I can totally see the value of such an import function, if you either seldom change existing obsidian notes, or you just use it as a one-time import. But if it was intended to keep a vault, where you keep adding and changing notes, and the add-on notes "in sync", I see problems arise.

aviral-batra commented 3 years ago

I think there is a solution to that - store an 'id' for the note in the YAML frontmatter. But I think it's not a good idea as it will likely involve writing to the files.

I think to combat the issue of not being in sync, the addon could show you a list of files whose path no longer exists or which have moved around, and you could then fix those few manually: change the path, add the file again, or delete the note for files you have deleted.
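A minimal sketch of that check, assuming the add-on can hand over a mapping from note id to the linked file path. The function name `find_missing_paths` and the shape of the mapping are hypothetical and not part of the add-on's code:

```python
import os
from typing import Dict, List


def find_missing_paths(note_paths: Dict[int, str]) -> List[int]:
    """Return the ids of notes whose linked file no longer exists on disk.

    `note_paths` maps a (hypothetical) add-on note id to the file path
    that was stored when the note was imported.
    """
    return [note_id for note_id, path in note_paths.items() if not os.path.isfile(path)]
```

The returned ids could then be shown in a list so the user can repair or delete those few notes by hand.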

Also, when I have used obsidian, I have never seen this title=filename functionality. Do you mean that the first header gets put as the filename? Because if so, I don't think that happens (but I may be wrong). Maybe it's a setting.

I think regardless, the point you made in your last paragraph means that even if you cannot 'sync' the notes, the majority of the hard work is done for you and it is far less painstaking to add new notes. As I mentioned above, the add on could still point out to you if you have changed the path of any notes.

N.B. here the drawback would be that you can't tell when the person has brought in a new file and named it exactly the same as the old one and put it in the exact same folder, but I think that seldom occurs.

fonol commented 3 years ago

> Also, when I have used obsidian, I have never seen this title=filename functionality. Do you mean that the first header gets put as the filename? Because if so, I don't think that happens (but I may be wrong). Maybe it's a setting.

I am not really an Obsidian expert, but on my local installation, this seems to be how it works out of the box.

image

> I think regardless, the point you made in your last paragraph means that even if you cannot 'sync' the notes, the majority of the hard work is done for you and it is far less painstaking to add new notes. As I mentioned above, the add on could still point out to you if you have changed the path of any notes.

Totally agree. Now that I think about it, I would very much like to see that feature realized as a general mass import function. E.g. you point it to a folder, give a list of file extensions, and optionally a priority and tags.
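As a rough sketch of what such a mass import could collect before any notes are created: the names `ImportCandidate` and `collect_import_candidates` are made up for illustration, and how priority and tags are actually stored in the add-on is not shown here.

```python
import os
from dataclasses import dataclass, field
from typing import Iterable, List, Optional


@dataclass
class ImportCandidate:
    """One file that would become an add-on note (hypothetical structure)."""
    path: str
    priority: Optional[int] = None
    tags: List[str] = field(default_factory=list)


def collect_import_candidates(
    folder: str,
    extensions: Iterable[str],
    priority: Optional[int] = None,
    tags: Iterable[str] = (),
) -> List[ImportCandidate]:
    """Walk `folder` recursively and keep files with one of the given extensions."""
    # normalise "pdf", ".pdf" and ".PDF" to the same lower-case suffix
    wanted = tuple("." + ext.lower().lstrip(".") for ext in extensions)
    candidates = []
    for root, _dirs, files in os.walk(folder):
        for name in sorted(files):
            if name.lower().endswith(wanted):
                candidates.append(ImportCandidate(os.path.join(root, name), priority, list(tags)))
    return candidates
```

For example, `collect_import_candidates("/path/to/vault", ["md", "pdf"], priority=50, tags=["import"])` would return one candidate per Markdown or PDF file below that folder.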

aviral-batra commented 3 years ago

Hmm - I think Untitled 1 - which is at the top of the page - is the file name, not the 'header' or 'title'. If you change it at the top, I think the file name (and hence the path) will change too.

> Now that I think about it, I would very much like to see that feature realized as a general mass import function. E.g. you point it to a folder, give a list of file extensions, and optionally a priority and tags.

++ Maybe also review settings e.g. if you wanted to have it set to growing Ivl, but otherwise that sounds amazing!

aviral-batra commented 3 years ago

On another note: If you wanted to integrate more with obsidian, the YAML might be useful for specifying custom settings and you could perhaps have a column in the table to contain the markdown style links in the file.

Right now I use the notion python API to recursively scan all my markdown notes and create relational databases between the note types. It creates a new table based on a 'type' in the YAML, then creates relations between the tables based on markdown links.

image

image

Similarly, note 'types' could be specified in the YAML alongside review settings, e.g. above I have last reviewed in the YAML.

This may be going out on a limb though 😆.

fonol commented 3 years ago

> Hmm - I think Untitled 1 - which is at the top of the page - is the file name, not the 'header' or 'title'. If you change it at the top, I think the file name (and hence the path) will change too.

That was badly communicated on my side; that's what I meant by "title".

> On another note: If you wanted to integrate more with obsidian, the YAML might be useful for specifying custom settings and you could perhaps have a column in the table to contain the markdown style links in the file.

Sorry, can't follow that one. Which YAML file do you mean?

> Right now I use the notion python API to recursively scan all my markdown notes and create relational databases between the note types. It creates a new table based on a 'type' in the YAML, then creates relations between the tables based on markdown links.
>
> image
>
> image
>
> Similarly, note 'types' could be specified in the YAML alongside review settings, e.g. above I have last reviewed in the YAML.

That is a really interesting approach, but currently a bit far out of my scope to be honest. I am personally not using Obsidian (but I know of at least two users that do), so my main motivation to do such a feature would be to have a general batch import (which I would use for PDFs mainly).

> This may be going out on a limb though 😆.

Maybe, but there is no harm in fantasizing about possible features; in fact, I very much like doing exactly that :-).

aviral-batra commented 3 years ago

> Sorry, can't follow that one. Which YAML file do you mean?

I meant the YAML frontmatter (YFM):

> YFM is an optional section of valid YAML that is placed at the top of a page and is used for maintaining metadata for the page and its contents.

So not a separate file, just a section of YAML syntax at the start of the markdown file, marked out by '---' above and below it. Sorry, I didn't make that clear.
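To make that concrete, here is a small sketch that pulls the front matter out of a note's text. It is only an illustration: it understands flat `key: value` lines, not full YAML (a real implementation would more likely use a YAML parser), and the function name `read_front_matter` is made up.

```python
from typing import Dict


def read_front_matter(text: str) -> Dict[str, str]:
    """Return the flat key/value pairs between the leading '---' markers.

    Returns an empty dict if the text does not start with '---' or the
    closing '---' is missing. Nested YAML and lists are not handled.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta  # closing marker reached
        key, sep, value = line.partition(":")
        if sep and key.strip():
            meta[key.strip()] = value.strip()
    return {}  # no closing marker: treat as "no front matter"
```

With a note that starts with `---`, `id: 42`, `type: lecture`, `---`, this would give `{"id": "42", "type": "lecture"}`, so an 'id' or 'type' stored there could be read without touching the rest of the file.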

> I am personally not using Obsidian (but I know of at least two users that do), so my main motivation to do such a feature would be to have a general batch import (which I would use for PDFs mainly).

That's fair enough. To be honest, the batch import would satisfy my use case and I would probably turn to using the add on for all of my review organisation anyway and ditch notion - it would be easier to have everything in one place.

This addon is really multifaceted, so kudos to you!

p4nix commented 3 years ago

As an alternative, have you considered just using Obsidian to Anki ( https://xkcd.com/1205/ )? This way, you can use Anki cards as "entrypoints" into your collection and use the Rememorize add-on for scheduling (or just regular schedule with tweaked deck settings). The advantage is that the add-on auto-updates the file links, which is obviously extremely convenient.

image

Another hint: using Zettelkasten-type note names (an ID) and never changing them could be a workaround for changing filenames.
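A sketch of how an importer could use such IDs, assuming the file name starts with a 12 to 14 digit timestamp ID (e.g. `202101151230 Some title.md`). That naming scheme is an assumption for illustration, and the function name `zettel_id` is made up:

```python
import os
import re
from typing import Optional

# assumed convention: a 12-14 digit timestamp at the very start of the file name
_ID_PREFIX = re.compile(r"^(\d{12,14})(?!\d)")


def zettel_id(path: str) -> Optional[str]:
    """Return the leading timestamp ID of a note's file name, or None if there is none."""
    match = _ID_PREFIX.match(os.path.basename(path))
    return match.group(1) if match else None
```

Matching notes by this ID instead of by full path would keep a note recognisable even if the title part of the file name or its folder changes.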

aviral-batra commented 3 years ago

I use obsidian to anki to make notes, but using it like that is an interesting thought.

I had not heard of the rememorize addon but I had a quick read. So what you are saying is that each file has its own 'flashcard', and then using the rememorize addon I can schedule this card for me to review again on a certain date?

That is definitely a good idea. It would require some organising i.e. you would have to prevent this 'page' card from going into the same deck as the other cards on an obsidian to anki page with 'normal' cards, but it would work well once set up. Thanks for the tip! I'll definitely look into that.

I guess my use case aside, the bulk import might still be a good feature for SIAC because it offers pdf and video support, as well as the ability to make anki cards from pdfs. If you had a large directory of such files, then it would be nice to have them all in one place.

p4nix commented 3 years ago

For PDFs, this is why I wrote the Zotero importer at one point (and fonol kindly adapted it into the add-on at that time, as I didn't understand any of the qt code back then). But yes, some crude bulk import would be nice and probably also not too hard to implement.

aviral-batra commented 3 years ago

Hi @fonol and @p4nix. If you haven't started working on it, I've written a bit of code for a dialog that allows you to select directories to ignore and then outputs a list of filepaths which you can iterate through. I could open a pull request, but I'm not sure where you would want it to go.

I just wanted to ask what to do.

P.S. I haven't really contributed before so please tell me if I'm doing anything wrong.

I'll leave the code below, just in case:

The dialog:

import math
from typing import Optional, List
import os

import maindialog

from PyQt5.QtCore import Qt
from PyQt5.QtWidgets import QDialog, QListWidgetItem
from PyQt5 import QtCore

from aqt.main import AnkiQt
from aqt.utils import showWarning

class NoteImporterDialog(QDialog):
    def __init__(self, mw: AnkiQt):
        # noinspection PyTypeChecker
        QDialog.__init__(self, parent=mw)
        self.mw = mw
        self.ui = maindialog.Ui_OrganiserDialog()
        self.ui.setupUi(self)

        self.ui.syncButton.clicked.connect(self.add_notes)

        self.ui.dirPathLineEdit.textEdited.connect(self.refresh_dirs_to_ignore_list)

        self.ui.showTopSubdirsCheckbox.stateChanged.connect(self.refresh_dirs_to_ignore_list)
        self.ui.ignoreAllHiddenCheckbox.stateChanged.connect(self.refresh_dirs_to_ignore_list)
        self.ui.dontShowHiddenCheckbox.stateChanged.connect(self.refresh_dirs_to_ignore_list)

    # Slots
    ##########################################################################

    def add_notes(self):
        path = self.ui.dirPathLineEdit.text()
        if os.path.isdir(path):  # a plain file would pass exists() but cannot be scanned
            ignore_list = self.create_ignore_list_from_selection(path)
            list_of_files = return_filepath_list(path, ignore_list, self.ui.ignoreDirsRecursivelyCheckbox.isChecked())

            total = len(list_of_files)
            for done, file in enumerate(list_of_files, start=1):  # here you can do what you want with the files
                # scale from the running count so the bar still advances with more than 100 files
                self.ui.progressBar.setValue(done * 100 // total)
            self.ui.progressBar.setValue(100)

        else:
            showWarning(f"{path} is not a valid directory path.\nPlease try again")

    def refresh_dirs_to_ignore_list(self) -> None:
        path = self.ui.dirPathLineEdit.text()
        self.ui.dirIgnoreLw.clear()
        # if the path input is an existing directory, update the list widget with all of the subdirectories
        # otherwise, tell the user that the path does not exist
        if os.path.isdir(path):
            if self.ui.showTopSubdirsCheckbox.isChecked():
                subdir_list = [d for d in return_top_level_subdirs(path) if d]
            else:
                subdir_list = [d for d in return_all_subdirs(path) if d]
            # don't show any of the directories that start with '.'
            if self.ui.dontShowHiddenCheckbox.isChecked() or self.ui.ignoreAllHiddenCheckbox.isChecked():
                subdir_list = [d for d in subdir_list if not d.startswith(r".")]

            for d in subdir_list:
                self.add_checkable_item_to_list_view(d)
        else:
            self.add_item_to_list_view("Path entered above does not exist, please enter a real path")

    # Gui functions
    ##########################################################################

    def add_checkable_item_to_list_view(self, text: str) -> None:
        self.ui.lwi = QListWidgetItem()
        self.ui.lwi.setText(text)
        self.ui.lwi.setFlags(self.ui.lwi.flags() | QtCore.Qt.ItemIsUserCheckable)
        self.ui.lwi.setCheckState(QtCore.Qt.Unchecked)
        self.ui.dirIgnoreLw.addItem(self.ui.lwi)

    def add_item_to_list_view(self, text: str):
        self.ui.lwi = QListWidgetItem()
        self.ui.lwi.setText(text)
        self.ui.dirIgnoreLw.addItem(self.ui.lwi)

    # Utility functions
    ##########################################################################

    def create_ignore_list_from_selection(self, path) -> list:
        """Creates a list containing the directories to ignore in the scan"""

        # returns a generator of all the directories to ignore from those selected in the list view
        def return_gen_of_dirs_to_ignore():
            for i in range(self.ui.dirIgnoreLw.count()):
                list_item = self.ui.dirIgnoreLw.item(i)
                if list_item.checkState() == Qt.Checked:
                    yield list_item.text()

        gen = return_gen_of_dirs_to_ignore()
        ignore_list = [os.path.join(path, p) for p in gen]

        # if ignore all hidden is checked, add all the directories that start with '.' to the ignore list
        if self.ui.ignoreAllHiddenCheckbox.isChecked():
            for p in os.listdir(path):
                if p.startswith("."):
                    ignore_list.append(os.path.join(path, p))

        return ignore_list

def return_filepath_list(path: str, list_of_dirs_to_ignore: Optional[list] = None, ign_recursively: bool = True) -> List[str]:
    """Returns list of files in all dirs and subdirs of the dir path given"""
    if list_of_dirs_to_ignore is None:
        list_of_dirs_to_ignore = []
    list_of__file_paths = []

    for subdir, dirs, files in os.walk(path):
        if ign_recursively:
            # skip a directory if it is an ignored directory or lies anywhere below one
            ignored = any(subdir == d or subdir.startswith(d + os.sep) for d in list_of_dirs_to_ignore)
        else:
            # skip only the ignored directories themselves, not their subdirectories
            ignored = subdir in list_of_dirs_to_ignore
        if not ignored:
            for filename in files:
                list_of__file_paths.append(os.path.join(subdir, filename))

    return list_of__file_paths

def return_all_subdirs(path) -> list:
    dir_list = []
    for subdir, dirs, files in os.walk(path):
        rel_path = os.path.relpath(subdir, start=path)
        if rel_path != '.': # stops it from adding the current directory as '.'
            dir_list.append(rel_path)
    return dir_list

def return_top_level_subdirs(path) -> list:
    return [d for d in os.listdir(path) if os.path.isdir(os.path.join(path, d))]

The file from qt designer, called maindialog.py:

# -*- coding: utf-8 -*-

# Form implementation generated from reading ui file 'maindialog.ui'
#
# Created by: PyQt5 UI code generator 5.15.2
#
# WARNING: Any manual changes made to this file will be lost when pyuic5 is
# run again.  Do not edit this file unless you know what you are doing.

from PyQt5 import QtCore, QtGui, QtWidgets

class Ui_OrganiserDialog(object):
    def setupUi(self, OrganiserDialog):
        OrganiserDialog.setObjectName("OrganiserDialog")
        OrganiserDialog.resize(525, 388)
        sizePolicy = QtWidgets.QSizePolicy(QtWidgets.QSizePolicy.Preferred, QtWidgets.QSizePolicy.Minimum)
        sizePolicy.setHorizontalStretch(0)
        sizePolicy.setVerticalStretch(0)
        sizePolicy.setHeightForWidth(OrganiserDialog.sizePolicy().hasHeightForWidth())
        OrganiserDialog.setSizePolicy(sizePolicy)
        self.verticalLayout = QtWidgets.QVBoxLayout(OrganiserDialog)
        self.verticalLayout.setObjectName("verticalLayout")
        self.dirScanHL = QtWidgets.QFormLayout()
        self.dirScanHL.setObjectName("dirScanHL")
        self.scanPathLabel = QtWidgets.QLabel(OrganiserDialog)
        self.scanPathLabel.setObjectName("scanPathLabel")
        self.dirScanHL.setWidget(0, QtWidgets.QFormLayout.LabelRole, self.scanPathLabel)
        self.dirPathLineEdit = QtWidgets.QLineEdit(OrganiserDialog)
        self.dirPathLineEdit.setObjectName("dirPathLineEdit")
        self.dirScanHL.setWidget(0, QtWidgets.QFormLayout.FieldRole, self.dirPathLineEdit)
        self.verticalLayout.addLayout(self.dirScanHL)
        self.dirIgnoreSettingsHL = QtWidgets.QHBoxLayout()
        self.dirIgnoreSettingsHL.setObjectName("dirIgnoreSettingsHL")
        self.dirIgnoreLabel = QtWidgets.QLabel(OrganiserDialog)
        self.dirIgnoreLabel.setObjectName("dirIgnoreLabel")
        self.dirIgnoreSettingsHL.addWidget(self.dirIgnoreLabel)
        self.verticalLayout.addLayout(self.dirIgnoreSettingsHL)
        self.gridLayout = QtWidgets.QGridLayout()
        self.gridLayout.setObjectName("gridLayout")
        self.ignoreAllHiddenCheckbox = QtWidgets.QCheckBox(OrganiserDialog)
        self.ignoreAllHiddenCheckbox.setObjectName("ignoreAllHiddenCheckbox")
        self.gridLayout.addWidget(self.ignoreAllHiddenCheckbox, 1, 0, 1, 1)
        self.dontShowHiddenCheckbox = QtWidgets.QCheckBox(OrganiserDialog)
        self.dontShowHiddenCheckbox.setObjectName("dontShowHiddenCheckbox")
        self.gridLayout.addWidget(self.dontShowHiddenCheckbox, 0, 1, 1, 1)
        self.showTopSubdirsCheckbox = QtWidgets.QCheckBox(OrganiserDialog)
        self.showTopSubdirsCheckbox.setObjectName("showTopSubdirsCheckbox")
        self.gridLayout.addWidget(self.showTopSubdirsCheckbox, 0, 0, 1, 1)
        self.ignoreDirsRecursivelyCheckbox = QtWidgets.QCheckBox(OrganiserDialog)
        self.ignoreDirsRecursivelyCheckbox.setEnabled(True)
        self.ignoreDirsRecursivelyCheckbox.setCheckable(True)
        self.ignoreDirsRecursivelyCheckbox.setChecked(True)
        self.ignoreDirsRecursivelyCheckbox.setObjectName("ignoreDirsRecursivelyCheckbox")
        self.gridLayout.addWidget(self.ignoreDirsRecursivelyCheckbox, 1, 1, 1, 1)
        self.verticalLayout.addLayout(self.gridLayout)
        self.dirIgnoreLw = QtWidgets.QListWidget(OrganiserDialog)
        self.dirIgnoreLw.setObjectName("dirIgnoreLw")
        self.verticalLayout.addWidget(self.dirIgnoreLw)
        self.syncButtonHL = QtWidgets.QHBoxLayout()
        self.syncButtonHL.setObjectName("syncButtonHL")
        spacerItem = QtWidgets.QSpacerItem(40, 20, QtWidgets.QSizePolicy.Expanding, QtWidgets.QSizePolicy.Minimum)
        self.syncButtonHL.addItem(spacerItem)
        self.syncButton = QtWidgets.QPushButton(OrganiserDialog)
        self.syncButton.setObjectName("syncButton")
        self.syncButtonHL.addWidget(self.syncButton)
        spacerItem1 = QtWidgets.QSpacerItem(40, 20, QtWidgets.QSizePolicy.Expanding, QtWidgets.QSizePolicy.Minimum)
        self.syncButtonHL.addItem(spacerItem1)
        self.verticalLayout.addLayout(self.syncButtonHL)
        self.progressBar = QtWidgets.QProgressBar(OrganiserDialog)
        self.progressBar.setProperty("value", 0)
        self.progressBar.setObjectName("progressBar")
        self.verticalLayout.addWidget(self.progressBar)
        self.scanPathLabel.setBuddy(self.dirPathLineEdit)

        self.retranslateUi(OrganiserDialog)
        QtCore.QMetaObject.connectSlotsByName(OrganiserDialog)

    def retranslateUi(self, OrganiserDialog):
        _translate = QtCore.QCoreApplication.translate
        OrganiserDialog.setWindowTitle(_translate("OrganiserDialog", "External Files Anki Organiser"))
        self.scanPathLabel.setText(_translate("OrganiserDialog", "Path of directory to scan"))
        self.dirIgnoreLabel.setText(_translate("OrganiserDialog", "Select subdirectories/files to ignore"))
        self.ignoreAllHiddenCheckbox.setText(_translate("OrganiserDialog", "Ignore all directories starting with \'.\' in the scan"))
        self.dontShowHiddenCheckbox.setText(_translate("OrganiserDialog", "Don\'t show directories starting with \'.\'"))
        self.showTopSubdirsCheckbox.setText(_translate("OrganiserDialog", "Show only top level directories"))
        self.ignoreDirsRecursivelyCheckbox.setText(_translate("OrganiserDialog", "Ignore selected directories recursively"))
        self.syncButton.setText(_translate("OrganiserDialog", "Import File Notes"))
fonol commented 3 years ago

Thanks, appreciate the effort. I will try it out when I have some time.

fonol commented 3 years ago

I added the dialog at the appropriate place. If you want to play with it, uncomment line 39 in menubar.py, and then you should be able to access the dialog within the menu.

The dialog code itself is in src/dialogs/importing/general_import.py.

aviral-batra commented 3 years ago

Yep. I opened a pull request to clean up some stuff too.

bengineerdavis commented 3 years ago

This sounds awesome! Were these changes merged? Eager to test 🥇