irods / irods_capability_indexing

BSD 3-Clause "New" or "Revised" License

Indexing should maintain a stable number of delayed rule executions #45

Open bh9 opened 4 years ago

bh9 commented 4 years ago

Indexing should create new rule executions only once the previous batch of rule executions has been completed/removed. Our zone now has 8.2m documents in Elasticsearch, but 6.5m rule executions. That means 6.5m files on disk on the indexing provider. If, for example, the batch size defined the maximum number of rules created at once, this plugin would have a much smaller impact on the available resources on that provider.

trel commented 3 years ago

packedRei files are going away in 4.2.9 - does that reduce this concern completely?

bh9 commented 3 years ago

There's still the issue that 6.5m rule executions make it much harder to see what else is in the list. For us, it probably does remove the concern, since we don't use any other delayed rules on the indexed zones.

trel commented 3 years ago

So is this a visibility issue? Rather than a load issue?

Can GenQuery filter/find what you’re looking for?

trel commented 3 years ago

This could be implemented as an additional loop around the collection indexer that checks the outstanding queue length before pushing more jobs onto the queue.
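
A minimal sketch of that idea, in Python for illustration only (the plugin itself is not written in Python). Everything here is hypothetical: `index_collection_throttled`, `queue_length`, `enqueue`, `wait`, `max_outstanding`, and `batch_size` are made-up names, and the callbacks stand in for whatever the plugin would use to count and schedule delayed rules. It only shows the shape of the loop: check the outstanding queue length, wait if the next batch would exceed the cap, then push.

```python
from typing import Callable, Iterable, Iterator, List


def batched(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` items."""
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def index_collection_throttled(
    paths: Iterable[str],
    queue_length: Callable[[], int],   # hypothetical: count of outstanding delayed rules
    enqueue: Callable[[str], None],    # hypothetical: schedule one indexing job
    wait: Callable[[], None],          # hypothetical: sleep/poll until the queue may have drained
    max_outstanding: int,
    batch_size: int,
) -> int:
    """Push indexing jobs in batches without exceeding `max_outstanding`.

    Returns the number of jobs pushed.
    """
    if batch_size < 1 or batch_size > max_outstanding:
        raise ValueError("batch_size must be between 1 and max_outstanding")

    pushed = 0
    for batch in batched(paths, batch_size):
        # The additional loop: hold back until the batch fits under the cap.
        while queue_length() + len(batch) > max_outstanding:
            wait()
        for path in batch:
            enqueue(path)
            pushed += 1
    return pushed
```

In a real deployment `wait` would presumably sleep and `queue_length` would query the catalog for pending rule executions; both are left as callbacks here so the loop can be exercised without an iRODS server.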