Open chewbranca opened 2 years ago
@chewbranca good idea to have better priority classes!
Replicator db access from the replicator could be a system
db access or a have its own replicator
class. But I can see the user accessing db (user creating a replication doc) staying as an interactive request?
the replicator still allowed for doing a direct couch_changes fetch of the local database
I think that's just a left-over clause we didn't clean up when we removed the local
access.
Replicator db access from the replicator could be a system db access or a have its own replicator class. But I can see the user accessing db (user creating a replication doc) staying as an interactive request?
Yeah I think system
for the replicator db access would work fine.
I do like the idea of being able to prioritize replicator traffic at a different priority level than interactive
, I think that would be a useful addition allowing for replication traffic to be deprioritized a bit relative to normal interactive client requests. This would be a bigger change than just properly setting io_priority
values in places where it's currently absent, as I believe for that to work we would need to do something like extrapolate from the http request user agent that it's Couch-Replicator
and then propagate the replication
io_priority from the coordinator node to all the rpc nodes.
I think that's just a left-over clause we didn't clean up when we removed the local access.
Makes sense 👍
I've got two PRs out for this, one in the main CouchDB repo https://github.com/apache/couchdb/pull/4106 and a corresponding PR in the IOQ repo: https://github.com/apache/couchdb-ioq/pull/19
Can this be closed?
Summary
There are a number of codepaths performing IO operations that do not set an
io_priority
flag, so they do not properly get labeled or prioritized by IOQ and default toother
io_class [1]. Many moons ago I made a PR [2] that went through and properly tagged anywhere I could find making IO requests without anio_priority
set. It looks like the main gaps around indexing and indexer compaction were fixed in [3], however there are still a number of places lacking anio_priority
value.I also think it would be worthwhile to introduce at least one, perhaps two, additional IO classes. I think we should add a
system
io_priority class for things like auth cache, loading mem3 shards, global changes, couch_per_user, etc. I think it might also be worthwhile to add a dedicatedreplication
io_priority, although it kind of looks like the replicator no longer does local db replications? In [2], the replicator still allowed for doing a directcouch_changes
fetch of the local database, so perhaps my original concern around the replicator doing direct changes reads of full databases going unflagged with io_priority is no longer relevant. We should probably still tag the replicator doc updates with eithersystem
orreplication
though.Desired Behaviour
To minimize (or eliminate) the use of the fallback
other
io_priority class so that all IO is properly flagged with the appropriate IO class.Possible Solution
The PR in [2] has a number of simple additions of where to put the
io_priority
process dictionary value when missing. I tracked those down by modifying the code to throw an exception when absent and then iterated through all the locations I could find. I found many, but the approach was not fully exhaustive so I don't think we should crash by default when there's noio_priority
set.Additional context
[1] https://github.com/apache/couchdb/blob/main/src/ioq/src/ioq.erl#L84-L85 [2] https://github.com/apache/couchdb/pull/1998/files [3] https://github.com/apache/couchdb/commit/13bf0eb80813477bf3fbe79cffaf0faddffaaca9