GothenburgBitFactory / taskchampion

Personal task-tracking library
MIT License
79 stars 20 forks source link

Reconsider "level" of the Task/TaskMut interface #372

Closed djmitche closed 2 months ago

djmitche commented 1 year ago

The problem that brought me [to GothenburgBitFactory/taskwarrior#3029] is, Taskwarrior mostly doesn't call functions like set_description or add_annotation -- it interfaces directly with the key/value pairs that make up a task, via set_value. But it does call set_status since that updates the working set. So the issue was that Taskwarrior would call

and the second call would not "see" that the modified timestamp had already been set, and would do so again. On import, this causes a problem by setting the "modified" timestamp of all tasks to the time the import occurred.

Overall, this is a major "impedance mismatch" between TC and TW: TW wants to handle tasks as key/value maps, while TC was really designed for API consumers to call methods that embody more knowledge about the meaning of particular keys. These are more than just convenience wrappers, though: they also have this set-the-modified-timestamp thing, as well as automatically updating the working set. And (referring to GothenburgBitFactory/taskwarrior#3028) it might be good to cache parent/child relationships as well, to avoid having to load most of the pending tasks one-by-one. I've been mulling this over for a while but haven't come up with a concrete way to do better.

Originally posted by @djmitche in https://github.com/GothenburgBitFactory/taskwarrior/issues/3029#issuecomment-1414569579

djmitche commented 1 year ago

cc @ryneeverett continuing discussion from GothenburgBitFactory/taskwarrior#3029.

ryneeverett commented 1 year ago

A better understanding of why we're dissatisfied with the current interface would be helpful for developing and evaluating solutions. When a refactor doesn't solve a concrete problem or enable a concrete use case, it can be difficult to tell whether it is an actual improvement. In the case of GothenburgBitFactory/taskwarrior#3029, I would suggest that we were uncomfortable with that solution because it was not encapsulated in TaskMut but rather dependent upon the consumer manually setting the "modified' attribute before calling a method which automatically calls TaskMut::update_modified().

(In fact, I can't tell if GothenburgBitFactory/taskwarrior#3029 reliably fixes that. Looking at TDB2::add, it appears that the "modified" attribute has to be before "status" in the task.all() iteration or it will get automatically set by TaskMut::set_status() -> TaskMut::set_string() -> TaskMut::update_modified(). Is there any guarantee of that attribute iteration order? I'm not able to follow the Task construction from CmdImport::importSingleTask all the way to taskchampion. I think it goes through TCTask but I don't see exactly where/how a json object of task attributes constructs a Task task (obj);, so I don't know what kind of object we're dealing with here.)

Does the "impedence mismatch" manifest itself anywhere other than task imports? Can we envision any use case for a lower-level interface other than the import command?

Of the four "dynamic defaults" listed in CmdImport (uuid, modified, entry, and end), it seems that "modified" is the only one that is dynamically set within the Task/TaskMut implementations and the rest are implemented in Replica::new_task. This suggests that Task/TaskMut are the natural low-level interface and we just need to extract the high-level features (automatic TaskMut::update_modified() and conditional TaskMut::replica.add_to_working_set(self.uuid)). Some different ideas in no particular order:

  1. CmdImport::importSingleTask could propagate an argument to construct tasks that already have updated_modified=true. (E.g.: Context::getContext ().tdb2.add (task, default_attrs=false);)
  2. TaskMut could have a field storing a HashMap of modified attributes and wait to "commit" them all to the replica at once. Then we wouldn't need to worry about changing the "modified" attribute multiple times and it would only get automatically set if it isn't already in the HashMap when committing. On commit it could conditionally update the working set. It seems like this is essentially the way TaskMut is supposed to be used anyway since it can't mutate the "modified" attribute more than once. But I'm not sure how easy it would be to implement automatic commit or how much friction requiring a manual TaskMut::commit() would be.
  3. A high-level TaskMut::import(&mut self, attrmap: HashMap) could set updated_modified=true, conditionally update the working set, and then iterate over the attrmap, calling self.set_value().
  4. A high-level TaskMut::set(&mut self, attrmap: HashMap) could be the exclusive public interface. If "modified" is not in attrmap it gets a dynamic default. Then it conditionally updates the working set, and iterates over the attrmap, calling self.set_value(). (In fact, couldn't we have a Task::set create the TaskMut and do the whole operation in that method? That question probably reveals that I don't really understand the rationale behind the Task/TaskMut distinction.)
djmitche commented 1 year ago

Does the "impedence mismatch" manifest itself anywhere other than task imports? Can we envision any use case for a lower-level interface other than the import command?

It does, and maybe import is not the best angle to view it from. Summarizing the "mismatch" as concisely as I can:

The idea was that users of TC as a library would not need to remember the particulars of how "entry" is formatted or that "modified" has to be updated -- they could just call simple functions like t.start() or t.add_annotation(..) and the right key/values would be updated. But, that's very much not how TW is built.

It seems like we should have a "high level" API (with t.start() etc.) and a "low level" API (that TW can use, and that doesn't do any automatic coordination).

I like the idea of batching updates, too, especially in the low-level API. That might allow us to get rid of the Task/TaskMut distinction, somewhat like you've suggested, by only requiring an exclusive reference to a Replica for Task::commit. There are a few spots where Taskwarrior modifies tasks in memory and then gives an "are you sure" prompt to the user; a separate Task::commit operation might support that better, too (my memory is a little vague on this point, sorry).

The rationale for Task/TaskMut is this: we want to be able to hold onto several Tasks at the same time for purposes of reading data from the. However, you need an exlusive reference (&mut) to a Replica in order to make changes to tasks -- that's literally mutating the replica, so it makes sense. So if there were no TaskMut and every Task carried a mutable reference to a Replica, then you could only have one Task at a time, and couldn't do anything else with the Replica while it existed.

djmitche commented 3 months ago

I've gotten a start to this now, in #413. I'll leave some notes here that I will probably return to and update later.

Notes

General Goal

Complications

Use by Taskwarrior

For read-only purposes, TW's Task mirrors the first bullet point above: they are little more than std::map <std::string, std::string> data. So, they could easily be adjusted to contain a tc::Task representing the same thing.

The existing TDB2::modify method takes a Task, compares it to the one in the DB, and makes the necessary changes such that the DB matches the given Task. If Task carries an Operations recording the changes made to it, then TDB2::modify would just be committing those changes.

All of the other operations are more immediate: deleting, creating, etc. Those can be accomplished by creating the necessary operations and immediately committing them.

djmitche commented 3 months ago

A bit more progress now. I think we can do this without breaking changes to the public API, and I'd like to get that merged first (it will be split over a few PRs).

Once that's done, I think we can actually do a much better job with the "high-level" API, too. Since nothing really uses those, I think this is a good time to break them.

djmitche commented 3 months ago

The sequence of PRs I've got in https://github.com/GothenburgBitFactory/taskchampion/compare/main...djmitche:taskchampion:issue372-operations ends up creating a new BasicTask that just exposes getters and setters, with the setters generating a sequence of Operations that can later be committed. The existing functionality (Task, TaskMut and Replica functions like delete_task) are all rewritten in terms of this new type, without breaking API changes (I thope!).

I'm not sure what to do next. I kind of dislike the name BasicTask and would like to just have one Task type. But, I'd also like to have a clear distinction between low-level and high-level functionality in the generated docs. A few ideas I've had:

I'm open to opinions or other ideas!

ryneeverett commented 3 months ago

Another idea, though I doubt it is better:

Edit: Reflecting on this a bit more, I think I can make a case that this option has many of the benefits of the other options without the drawbacks:

djmitche commented 2 months ago

That makes a lot of sense - thanks! I appreciate your careful thought about this.

djmitche commented 2 months ago

Current status: this is largely finished in taskchampion, and I'm working on building the FFI interface using https://cxx.rs in Taskwarrior. That's already led to a few relatively minor local changes in TC, and may generate a few more, Once I've got all of that working I'll put the changes up for review.