OSM needs config file(s), and needs to find obsidian.json

peterkaminski commented 1 year ago

@dgou, wdyt?

Considering #7, #8, and #9:

OSM needs to find obsidian.json, and there are well-known typical places on each platform.
Sometimes obsidian.json is not in the typical place, so there needs to be a way to read the right place from a config file (along with environment).
OSM needs to know what directories and files to include and exclude for its operation, so we need a config file for that.

Usability-wise, I would like:

OSM works for most people without configuration.
OSM is easy to configure if you need to configure it.
If you copy just osm.py someplace else (e.g. /usr/local/bin or whatever) and don't copy any other doc or config file, it still works for most people.
If we have configuration data, it happens in one file, not multiple files.

Therefore, my proposal:

Look up the typical Obsidian root directory based on platform and pathlib.Path.home(), similar to Qt's QStandardPaths documentation. (N.B., that code is for a writable dir, not the config dir, so the Windows and Linux paths need to be corrected).
Override the typical Obsidian root directory with a config file and environment variable.
Use one config file, rather than multiple, which means we need to use YAML or TOML (or something worse) rather than one text file per config section.
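A minimal sketch of the lookup-then-override idea above. The per-platform directories are the commonly reported Obsidian config locations and should be verified; the `OSM_OBSIDIAN_ROOT` variable name and the "environment beats config file" precedence are assumptions for illustration, not something settled in this thread.

```python
import os
import sys
from pathlib import Path


def default_obsidian_root(platform=sys.platform, home=None):
    """Return the typical Obsidian config directory for a platform."""
    home = Path(home) if home else Path.home()
    if platform == "darwin":
        return home / "Library" / "Application Support" / "obsidian"
    if platform.startswith("win"):
        return home / "AppData" / "Roaming" / "obsidian"
    # Assumed fallback for Linux and other Unix-like systems.
    return home / ".config" / "obsidian"


def obsidian_root(config=None, environ=os.environ):
    """Resolve the Obsidian root: env var, then config file, then default."""
    config = config or {}
    # OSM_OBSIDIAN_ROOT is a hypothetical variable name.
    override = environ.get("OSM_OBSIDIAN_ROOT") or config.get("obsidian_root_directory")
    return Path(override) if override else default_obsidian_root()
```

With no config and no environment variable set, `obsidian_root()` falls back to the platform default, which keeps the "works without configuration" goal intact.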

Massive Wiki Builder uses the --config/-c flag to specify the config file, typically named osm.yaml, but could be any path/filename:

./osm.py -c /path/to/my/osm-config-file.yml [commands]

So, use --config/-c unless there's clearly a problem with that.

Once we have a config file, we might find other uses for it, too, and any additional config data would be added to that one file, rather than adding more text config files.

Prototype of a YAML config file (TOML would be similar but different):

obsidian_root_directory: "/home/peterkaminski/.var/app/md.obsidian.Obsidian/config/obsidian"

files_to_copy:
  - include: "."
  - exclude: "*.json"
  - exclude: "workspace*"
  - include: "snippets"
  - exclude: "plugins"
  - include: "plugins/buttons"
  - exclude: "plugins/buttons/styles.css"
  - include: "plugins/tag-wrangler"
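One possible reading of the `files_to_copy` rules, sketched in Python. The semantics are an assumption (the thread does not define them): rules are evaluated top to bottom, a pattern matches a path, its basename, or any of its parent directories, and the last matching rule wins. `rule_matches` and `is_included` are hypothetical helper names.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# The prototype rules, as (action, pattern) pairs in file order.
RULES = [
    ("include", "."),
    ("exclude", "*.json"),
    ("exclude", "workspace*"),
    ("include", "snippets"),
    ("exclude", "plugins"),
    ("include", "plugins/buttons"),
    ("exclude", "plugins/buttons/styles.css"),
    ("include", "plugins/tag-wrangler"),
]


def rule_matches(pattern, rel_path):
    """True if the pattern matches the path or one of its parent directories."""
    if pattern == ".":
        return True  # "." stands for the whole settings directory
    path = PurePosixPath(rel_path)
    candidates = [path, *path.parents]
    return any(
        fnmatchcase(str(c), pattern) or fnmatchcase(c.name, pattern)
        for c in candidates
    )


def is_included(rel_path, rules=RULES):
    """Apply the rules in order; the last rule that matches decides."""
    decision = False  # nothing is copied unless some rule includes it
    for action, pattern in rules:
        if rule_matches(pattern, rel_path):
            decision = action == "include"
    return decision
```

Under this reading, `plugins/buttons/main.js` is copied while `plugins/buttons/styles.css` and everything under other plugin directories are skipped. Other orderings (for example, most specific rule wins) are equally plausible and would need to be decided.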

dgou commented 1 year ago

@peterkaminski Back atcha!

I am 10000% for having one stand-alone file with optional config, I really like that idea. Those PRs were playing around with operational semantics and config ideas, but they are pretty much orthogonal.

Per the known list of Obsidian directories for config, yeah, ok, putting that in the code isn't my favorite, but given the one-file-to-rule-them-all option, I'm fine with it.

In keeping with a minimal installation, I would prefer JSON over YAML so that it really is a standalone script on a modern system. I am not saying JSON is better, but it avoids all kinds of complex instructions about installing other packages, warnings about user installs and virtual environments, etc., which I feel are a distraction given how purpose-built and focused this tool is. Also, once YAML is battery-included, JSON is backwards compatible, so folks would be free to upgrade their configurations without being required to do so for a new release of OSM.

Riff'ing off the "Obsidian known config locations" thing, I would propose the following for the OSM config "search path":

-c/--config command line option
environment variable
current directory
user's home directory
built-in default
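The search path above could be sketched roughly as follows. The file name `osm.json`, the variable name `OSM_CONFIG`, and the choice to raise an error when an explicitly requested file is missing are all assumptions for illustration.

```python
import os
from pathlib import Path

CONFIG_NAME = "osm.json"  # hypothetical default file name
ENV_VAR = "OSM_CONFIG"    # hypothetical environment variable name


def find_config(cli_path=None, environ=os.environ, cwd=None, home=None):
    """Return the config file to use, or None to use the built-in default."""
    # 1 and 2: an explicit request (-c/--config, then the environment) must exist.
    for explicit in (cli_path, environ.get(ENV_VAR)):
        if explicit:
            path = Path(explicit)
            if not path.is_file():
                raise FileNotFoundError(path)
            return path
    # 3 and 4: look in the current directory, then the user's home directory.
    for directory in (cwd or Path.cwd(), home or Path.home()):
        candidate = Path(directory) / CONFIG_NAME
        if candidate.is_file():
            return candidate
    # 5: nothing found, so the caller falls back to the built-in default.
    return None
```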

And speaking of the built-in default, a short side tangent for context: I use https://github.com/beyondgrep/ack3 as my file searching tool, and I really like the --create-ackrc option for emitting the default configuration so that you can then tweak it. I'm not sanguine about fusing an external and internal configuration, but I really really like that the default configuration is not locked away in the code. The flag name sounds like it will create a configuration file, but it just prints the config on stdout.

So I would like to propose that osm have a --print-default-config option that would do the same thing, so if someone wants to make the jump from no-external-config to having their own, they don't have to cut'n'paste from the source.
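A rough sketch of how a `--print-default-config` flag might look with `argparse`, keeping the default as a JSON string in the script. The contents of `DEFAULT_CONFIG_JSON` here are placeholders, not the actual OSM defaults.

```python
import argparse
import json

# Placeholder default; the real built-in configuration would live here.
DEFAULT_CONFIG_JSON = """\
{
  "files_to_copy": [
    {"include": "."},
    {"exclude": "workspace*"}
  ]
}
"""


def main(argv=None):
    parser = argparse.ArgumentParser(prog="osm.py")
    parser.add_argument(
        "--print-default-config",
        action="store_true",
        help="print the built-in default configuration on stdout and exit",
    )
    args = parser.parse_args(argv)
    if args.print_default_config:
        # Print only; like ack's --create-ackrc, no file is written.
        print(DEFAULT_CONFIG_JSON, end="")
        return 0
    config = json.loads(DEFAULT_CONFIG_JSON)  # normal runs parse the same string
    return 0 if config else 1
```

A user could then redirect the output into a file, edit it, and pass that file back with `-c`.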

peterkaminski commented 1 year ago

@dgou, that all looks great! Let's do it! (By which I mean, whoever gets there first -- I'll ping here if I start working on it.)

fyi, I just added the vim swap thing to .gitignore directly in main.

dgou commented 1 year ago

@peterkaminski I have already started on it. The raw branch (don't look unless you close your eyes and hold your nose) is https://github.com/dgou/obsidian-settings-manager/tree/simple-config Just shows the direction I am going in. If I get it done first I will need to redo the commit history, right now it's very exploratory.

So if you beat me to it, that's cool, I am adulting at home and working on it in dribs and drabs this weekend.

dgou commented 1 year ago

@peterkaminski Actually, take a look at the first few commits, it's where I sketch out how the default OSM config will work, both as a JSON string in the code and the structure of the config format. Changing the code that parses it won't be hard, but it'd be nice to correct course earlier if possible. Thanks!

peterkaminski commented 1 year ago

Looks good to me!

dgou commented 1 year ago

Since it isn't a PR yet, I feel free to force push. Realized I needed to back up and structure the code a bit more for the direction I was going. Since this is a single monolith file I added comments about it to help me remember how it was being done :-)

dgou commented 1 year ago

Ok, so I hit a good stopping point and submitted #12

With the ability to carve up the directory structure arbitrarily as shown in #10, it seems that keeping the same strategy for renaming the files in place with a time-stamp will leave backup files and directories scattered all through the obsidian directory. While we have the --backup-list and --backup-remove options to help, I think it's going to be too confusing, and the code will have to go deep to find all the backups. Esp. if people are playing with the copy options now that they will have more flexibility.

I'm not sure what I think the right answer is except "simpler".

peterkaminski commented 1 year ago

My first thought on backup strategy for deep files, without having thought about it too much, is to store the backups in zip files.

import zipfile

backup_filename = 'obsidian-backup.zip'  # placeholder archive name

try:
    # Write each file to be backed up into one compressed archive.
    with zipfile.ZipFile(backup_filename, 'w', compression=zipfile.ZIP_DEFLATED) as backup_zip:
        backup_zip.write('file1')
        backup_zip.write('file2')
except FileNotFoundError:
    print("The file you are trying to zip does not exist.")
except PermissionError:
    print("You do not have permission to access or modify this file.")
except Exception as e:
    print("An unexpected error occurred:", e)

dgou commented 1 year ago

@peterkaminski Updates as per commit 19, the commit messages should be pretty self-explanatory. I did not update the README on purpose as this is still rather spikey.

The zip file thing is intriguing. I wonder if we go for super simplicity and just zip the whole ding dang dong config directory. That way it can be put back exactly as it was without any searching, little fiddly file renames, etc. Big hammer it and then maybe later if there is a compelling use case for onesy-twosey backups, add that as an option?

This is evolving into a general sync tool that knows a little bit about Obsidian, and I'd prefer not to reimplement rsync :-)

Going AFK for adulting most of today, but I think we're pretty close!

Also, putting pen down, so if you want to add commits to the branch, please do!

peterkaminski commented 1 year ago

Looking good, will do, I'll be adulting much of today, too.

And yeah, zipping everything would be super simple and Good Enough to start.

dgou commented 1 year ago

One thing I notice with the current direction of having files listed specifically, the diff output is a lot noisier. Thinking about how I would want to tweak that. Also like the idea of zipping everything. While I was out grocery shopping I realized it would be pretty nice to be able to make backups of the configurations even just before making local "what if I do this" kind of changes... Dropping in some brain noodles before I forget, not working on the code. ((This is a settings "manager" not just a settings copier :-) ))

dgou commented 1 year ago

A few more brain-droppings:

Right now we only back up a file that would be overwritten. But if the update causes a new file to be created, we have no way to "back up" that the new file should be removed. This could be an issue if a common plug-in gets a new file and changes to existing ones. The changes would be backed up, but a rollback would be complex because there is no residue to tell you to remove new files.
All copying is now files, so the code needs to make intermediate directories. copytree handles making sure directory permissions are copied over; we have to think about how to do this in our new code. (This happens when the source vault has files the destination vault(s) don't. If it was purely replacing only existing files this wouldn't come up.)
Definitely going to need a test suite for all of this!

dgou commented 1 year ago

Additional thoughts: For backups:

How about using https://docs.python.org/3/library/shutil.html#shutil.make_archive? We could even put the archive format into the configuration so that folks could choose their own, defaulting to zip
Location: If we archive the entire .obsidian directory, we don't want to write that archive into the .obsidian directory. I was thinking the backup could be a sibling dot-prefixed file: .osm.backup.<iso-date-time>.<make_archive_format> (The final extension is created by the make_archive function unless we rename the result ourselves).

peterkaminski commented 1 year ago

Nice catch with the rollback complexity. But it's only a problem if you try to blend the current state with an archived state, right? If you just want to restore an old state, you just blow away the current stuff and unarchive the archive? (Or probably more likely, you unarchive to another directory, and cherry pick by hand to find what you need.)
shutil.make_archive looks great. Definitely need to include archive format in the configuration, as you say.
I think I would have "obsidian" in the archive filename. Also, I wouldn't want to junk up the main directory with lots of archives, so how about putting them all in a directory? So, maybe .obsidian-osm-backups/.obsidian-osm-backup-<iso-date-time>.<make_archive_format>? (Personally I would use dashes as word separators, but dots are fine, too.)
With all the apparatus we'll have to do backups of .obsidian, does it make sense to also be able to do backups of whole vault(s), individually or severally? Not for settings, of course, but as an additional capability for people who would use it?

dgou commented 1 year ago

Yes, and that was kinda what had been happening before with the backed-up files and directories being in place. I really like the conceptual and technical simplicity of keeping the backups separated. And the restore being on the user to decide how/what/which-parts, etc.
Cool.
Yeah, you are right about junking up the main directory. Over lunch I was thinking it might be helpful to have the location be in the config too, with the default of .obsidian-osm-backups (I think obsidian in the name is redundant, but if it is in the config I don't care, I can always change it). Also it means I could put in an absolute path and keep it out of Obsidian's view/awareness altogether. Dashes are fine. And I think maybe we should change the colons "because windows" in the timestamp?
Yeah, I had been thinking about that as well, but with CloudDrive and DropBox and such things, I'm not sure. It's interesting to think about, but I don't want to pull it into this effort.
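Pulling the backup points above together, here is a hedged sketch using `shutil.make_archive`. The function name `backup_settings`, its parameters, and the exact timestamp layout are assumptions; the directory and file naming follow the `.obsidian-osm-backups/.obsidian-osm-backup-<iso-date-time>` suggestion, with dashes instead of colons in the time so the name is valid on Windows.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_settings(vault_dir, backup_dir=".obsidian-osm-backups",
                    archive_format="zip", now=None):
    """Archive a vault's whole .obsidian directory; return the archive path."""
    vault = Path(vault_dir).resolve()
    target_dir = Path(backup_dir)
    if not target_dir.is_absolute():
        # A relative location is taken to be a sibling of .obsidian in the vault.
        target_dir = vault / target_dir
    target_dir.mkdir(parents=True, exist_ok=True)
    # ISO-like timestamp with no colons.
    stamp = (now or datetime.now()).strftime("%Y-%m-%dT%H-%M-%S")
    base_name = target_dir / f".obsidian-osm-backup-{stamp}"
    # make_archive appends the extension for the chosen format itself.
    archive = shutil.make_archive(
        str(base_name), archive_format, root_dir=vault, base_dir=".obsidian"
    )
    return Path(archive)
```

Because the archive is written outside `.obsidian`, the backup never ends up containing earlier backups, and an absolute `backup_dir` keeps the archives out of the vault entirely.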

I am actually liking where this is going; it is a simpler backup mechanism to implement, to explain, and for users to get their heads around. The change for that will be straightforward. And also during lunch I figured out how I want to tackle making any intermediate directories. Once the list of files being copied is generated, I can also make a map of the containing directories and what statuses they have in the source vault. Then if the directory doesn't exist in the destination, it can be created, but if it already exists then we'll leave it alone. So update will be two passes: Create any needed directories, then copy the files.
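The two-pass update could look something like this sketch. `update_vault` is a hypothetical name, the destination root is assumed to exist already, and using `shutil.copystat` to carry over directory permissions is one option rather than a decided approach.

```python
import shutil
from pathlib import Path


def update_vault(src_root, dst_root, rel_files):
    """Copy rel_files from src_root to dst_root, creating missing directories."""
    src_root, dst_root = Path(src_root), Path(dst_root)

    # Collect every directory that contains a file to be copied.
    needed_dirs = set()
    for rel in rel_files:
        for parent in Path(rel).parents:
            if parent != Path("."):
                needed_dirs.add(parent)

    # Pass 1: create missing directories, shallowest first, and copy their
    # permissions and timestamps from the source; leave existing ones alone.
    for rel_dir in sorted(needed_dirs, key=lambda p: len(p.parts)):
        dst_dir = dst_root / rel_dir
        if not dst_dir.exists():
            dst_dir.mkdir()
            shutil.copystat(src_root / rel_dir, dst_dir)

    # Pass 2: copy the files themselves, including their metadata.
    for rel in rel_files:
        shutil.copy2(src_root / rel, dst_root / rel)
```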

I also really want to put in a separate "just do a settings backup" so I have a guard rail for playing with settings in the UI too. Another reason I really like the "back it all up" approach is that I don't have to craft a list of files/directories beforehand. Which is part of 4. osm.py --backup <vault> or ./osm.py --backup-all-vaults maybe?

I have limited coding time this week, so I'm eager to get to agreement on what we need to land this PR before spending coding time :-)

peterkaminski / obsidian-settings-manager

OSM needs config file(s), and needs to find obsidian.json #11