Closed djmitche closed 2 months ago
cc @ryneeverett continuing discussion from GothenburgBitFactory/taskwarrior#3029.
A better understanding of why we're dissatisfied with the current interface would be helpful for developing and evaluating solutions. When a refactor doesn't solve a concrete problem or enable a concrete use case, it can be difficult to tell whether it is an actual improvement. In the case of GothenburgBitFactory/taskwarrior#3029, I would suggest that we were uncomfortable with that solution because it was not encapsulated in TaskMut
but rather dependent upon the consumer manually setting the "modified' attribute before calling a method which automatically calls TaskMut::update_modified()
.
(In fact, I can't tell if GothenburgBitFactory/taskwarrior#3029 reliably fixes that. Looking at TDB2::add
, it appears that the "modified" attribute has to be before "status" in the task.all()
iteration or it will get automatically set by TaskMut::set_status()
-> TaskMut::set_string()
-> TaskMut::update_modified()
. Is there any guarantee of that attribute iteration order? I'm not able to follow the Task
construction from CmdImport::importSingleTask
all the way to taskchampion. I think it goes through TCTask
but I don't see exactly where/how a json object of task attributes constructs a Task task (obj);
, so I don't know what kind of object we're dealing with here.)
Does the "impedence mismatch" manifest itself anywhere other than task imports? Can we envision any use case for a lower-level interface other than the import command?
Of the four "dynamic defaults" listed in CmdImport
(uuid, modified, entry, and end), it seems that "modified" is the only one that is dynamically set within the Task
/TaskMut
implementations and the rest are implemented in Replica::new_task
. This suggests that Task
/TaskMut
are the natural low-level interface and we just need to extract the high-level features (automatic TaskMut::update_modified()
and conditional TaskMut::replica.add_to_working_set(self.uuid)
). Some different ideas in no particular order:
CmdImport::importSingleTask
could propagate an argument to construct tasks that already have updated_modified=true
. (E.g.: Context::getContext ().tdb2.add (task, default_attrs=false);
)TaskMut
could have a field storing a HashMap of modified attributes and wait to "commit" them all to the replica at once. Then we wouldn't need to worry about changing the "modified" attribute multiple times and it would only get automatically set if it isn't already in the HashMap when committing. On commit it could conditionally update the working set. It seems like this is essentially the way TaskMut
is supposed to be used anyway since it can't mutate the "modified" attribute more than once. But I'm not sure how easy it would be to implement automatic commit or how much friction requiring a manual TaskMut::commit()
would be.TaskMut::import(&mut self, attrmap: HashMap)
could set updated_modified=true
, conditionally update the working set, and then iterate over the attrmap
, calling self.set_value()
.TaskMut::set(&mut self, attrmap: HashMap)
could be the exclusive public interface. If "modified" is not in attrmap
it gets a dynamic default. Then it conditionally updates the working set, and iterates over the attrmap
, calling self.set_value()
. (In fact, couldn't we have a Task::set
create the TaskMut
and do the whole operation in that method? That question probably reveals that I don't really understand the rationale behind the Task
/TaskMut
distinction.)Does the "impedence mismatch" manifest itself anywhere other than task imports? Can we envision any use case for a lower-level interface other than the import command?
It does, and maybe import is not the best angle to view it from. Summarizing the "mismatch" as concisely as I can:
TW treats tasks as key/value maps, and coordinates updates to the keys for a task to accomplish higher-level things like task N start
or task N mod +foo -bar wait:2d
.
TC stores tasks as key/value maps internally, but presents them as things with meaningful attributes like "entry" or "status" or "description". It has methods for each of those, with typed inputs. And, it does some of the coordination based on those attributes, such as updating "modified" when a task is modified, or moving a task into the working set when its status becomes "pending", or updating the dependency graph when a dependency is changed.
The idea was that users of TC as a library would not need to remember the particulars of how "entry" is formatted or that "modified" has to be updated -- they could just call simple functions like t.start()
or t.add_annotation(..)
and the right key/values would be updated. But, that's very much not how TW is built.
It seems like we should have a "high level" API (with t.start()
etc.) and a "low level" API (that TW can use, and that doesn't do any automatic coordination).
I like the idea of batching updates, too, especially in the low-level API. That might allow us to get rid of the Task/TaskMut distinction, somewhat like you've suggested, by only requiring an exclusive reference to a Replica for Task::commit
. There are a few spots where Taskwarrior modifies tasks in memory and then gives an "are you sure" prompt to the user; a separate Task::commit
operation might support that better, too (my memory is a little vague on this point, sorry).
The rationale for Task/TaskMut is this: we want to be able to hold onto several Tasks at the same time for purposes of reading data from the. However, you need an exlusive reference (&mut) to a Replica in order to make changes to tasks -- that's literally mutating the replica, so it makes sense. So if there were no TaskMut and every Task carried a mutable reference to a Replica, then you could only have one Task at a time, and couldn't do anything else with the Replica while it existed.
I've gotten a start to this now, in #413. I'll leave some notes here that I will probably return to and update later.
Operation
instances, collected in an Operations
type that is a thin wrapper around Vec<Operation>
. New operations can be created in a variety of ways such as Replica::new_task
, Task::set_property
, or Task::delete
, all of which take &mut Operations
. (Note that this is, in essence, #373)Operations
to a Replica method.Operations
, and committing that undo involves reversing the operations and committing the result, with an optimization of "popping" matching operations from the DB's operations (as happens now).taskdb::apply
, which is basically the "commit" operation above, currently takes a SyncOp and creates an Operation from it. An Operation contains the information in a SyncOp but also undo-related information. Maybe it is more precise to capture this undo-related information at the time that the change is committed, but that would mean throwing out any undo-related information provided during the accumulation of the Operations
.Task
API would also stick around -- or something like it -- for users that want to operate at that higher level. For example, an Android app for adding new tasks on voice command probably doesn't want to be bothered with how the modified
property is encoded. It just wants to call something like replica.add_task()
and then task.set_description()
, etc. and have everything work out.For read-only purposes, TW's Task
mirrors the first bullet point above: they are little more than std::map <std::string, std::string> data
. So, they could easily be adjusted to contain a tc::Task representing the same thing.
The existing TDB2::modify
method takes a Task, compares it to the one in the DB, and makes the necessary changes such that the DB matches the given Task. If Task
carries an Operations
recording the changes made to it, then TDB2::modify
would just be committing those changes.
All of the other operations are more immediate: deleting, creating, etc. Those can be accomplished by creating the necessary operations and immediately committing them.
A bit more progress now. I think we can do this without breaking changes to the public API, and I'd like to get that merged first (it will be split over a few PRs).
Once that's done, I think we can actually do a much better job with the "high-level" API, too. Since nothing really uses those, I think this is a good time to break them.
The sequence of PRs I've got in https://github.com/GothenburgBitFactory/taskchampion/compare/main...djmitche:taskchampion:issue372-operations ends up creating a new BasicTask
that just exposes getters and setters, with the setters generating a sequence of Operations that can later be committed. The existing functionality (Task, TaskMut and Replica functions like delete_task
) are all rewritten in terms of this new type, without breaking API changes (I thope!).
I'm not sure what to do next. I kind of dislike the name BasicTask and would like to just have one Task type. But, I'd also like to have a clear distinction between low-level and high-level functionality in the generated docs. A few ideas I've had:
task.set_status(Status::Completed, &mut ops)
and move them to the Task type. This would be a breaking API change, but keeps the high-level interface while allowing multiple operations to be batched into a single transaction. However, it still has two task types (BasicTask and Task)I'm open to opinions or other ideas!
Another idea, though I doubt it is better:
Task
a wrapper around BasicTask
by replacing the uuid
and taskmap
properties with a single basictask
property and implementing the higher level interface entirely by using the lower level interface. If you wanted to expose the low-level interface also you could define the property as pub basictask
.Edit: Reflecting on this a bit more, I think I can make a case that this option has many of the benefits of the other options without the drawbacks:
That makes a lot of sense - thanks! I appreciate your careful thought about this.
Current status: this is largely finished in taskchampion, and I'm working on building the FFI interface using https://cxx.rs in Taskwarrior. That's already led to a few relatively minor local changes in TC, and may generate a few more, Once I've got all of that working I'll put the changes up for review.
The problem that brought me [to GothenburgBitFactory/taskwarrior#3029] is, Taskwarrior mostly doesn't call functions like
set_description
oradd_annotation
-- it interfaces directly with the key/value pairs that make up a task, viaset_value
. But it does callset_status
since that updates the working set. So the issue was that Taskwarrior would callt.set_value("modified", "12345")
t.set_status(Status::Pending)
and the second call would not "see" that the modified timestamp had already been set, and would do so again. On import, this causes a problem by setting the "modified" timestamp of all tasks to the time the import occurred.
Overall, this is a major "impedance mismatch" between TC and TW: TW wants to handle tasks as key/value maps, while TC was really designed for API consumers to call methods that embody more knowledge about the meaning of particular keys. These are more than just convenience wrappers, though: they also have this set-the-modified-timestamp thing, as well as automatically updating the working set. And (referring to GothenburgBitFactory/taskwarrior#3028) it might be good to cache parent/child relationships as well, to avoid having to load most of the pending tasks one-by-one. I've been mulling this over for a while but haven't come up with a concrete way to do better.
Originally posted by @djmitche in https://github.com/GothenburgBitFactory/taskwarrior/issues/3029#issuecomment-1414569579