willholley opened 10 years ago
I'm excited! Thanks for working on this :+1:
So you're referring to also updating and deleting documents as specified at http://wiki.apache.org/couchdb/HTTP_Bulk_Document_API? Right now Iron Cushion only inserts documents using that API; see the README file.
Maybe we can rename the "bulk insert flags" to "bulk update flags", and add command line flags named `create_weight`, `update_weight`, and `delete_weight`. These would mirror the similarly named flags for CRUD operations. The bulk update flags would then be:
- `num_documents_per_bulk_insert`: The number of documents in each bulk operation.
- `num_bulk_insert_operations`: The number of bulk operations performed by each connection.
- `create_weight`: Weight defining the number of create operations relative to other operations.
- `update_weight`: Weight defining the number of update operations relative to other operations.
- `delete_weight`: Weight defining the number of delete operations relative to other operations.

So the current bulk insert behavior could be replicated using a `create_weight` of 1 and an `update_weight` and `delete_weight` of 0.
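To illustrate how such weights might translate into an operation mix, here is a minimal sketch. The method and field names (`distribute`, `createWeight`, etc.) are hypothetical, not Iron Cushion's actual API; the point is only the proportional split and that `create_weight=1` with the others at 0 degenerates to pure inserts.

```java
// Hypothetical sketch: split a total operation count across the three
// operation types in proportion to their weights. Names are illustrative,
// not taken from Iron Cushion's source.
public class WeightedMix {
    // Returns {creates, updates, deletes}; integer-division remainder
    // is assigned to creates so the counts always sum to totalOps.
    static int[] distribute(int totalOps, int createWeight, int updateWeight, int deleteWeight) {
        int totalWeight = createWeight + updateWeight + deleteWeight;
        int updates = totalOps * updateWeight / totalWeight;
        int deletes = totalOps * deleteWeight / totalWeight;
        int creates = totalOps - updates - deletes;
        return new int[] { creates, updates, deletes };
    }

    public static void main(String[] args) {
        // With create_weight=1 and the other weights 0, every operation is
        // a create, reproducing the existing bulk insert behavior.
        int[] mix = distribute(100, 1, 0, 0);
        System.out.println(mix[0] + " creates, " + mix[1] + " updates, " + mix[2] + " deletes");
        // → 100 creates, 0 updates, 0 deletes
    }
}
```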
If that sounds good, you should be able to lift a lot of the code from the `createCrudOperations` method in `CrudOperations.java`. It's a little opaque, but it efficiently handles ensuring that create operations come before update operations, and update operations come before delete operations.
(Not trying to make things difficult here -- just trying to fit it in at the right place. Again, you should be able to lean heavily on `CrudOperations.java` for a lot of it.)
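For anyone following along, the invariant that `createCrudOperations` maintains could be sketched roughly like this. This is not the actual code from `CrudOperations.java` -- just an illustrative version of the ordering constraint: a random mix in which updates never outnumber the creates seen so far, and deletes never outnumber the updates seen so far (assuming, for simplicity, creates >= updates >= deletes).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Illustrative sketch (not Iron Cushion's actual implementation) of ordering
// a random mix of operations so every update targets an already-created
// document and every delete targets an already-updated one.
public class OrderedOps {
    enum Op { CREATE, UPDATE, DELETE }

    static List<Op> order(int creates, int updates, int deletes, long seed) {
        // Precondition keeps the loop below from stalling.
        if (!(creates >= updates && updates >= deletes)) {
            throw new IllegalArgumentException("requires creates >= updates >= deletes");
        }
        Random rng = new Random(seed);
        List<Op> ops = new ArrayList<>();
        int created = 0, updated = 0, deleted = 0;
        while (ops.size() < creates + updates + deletes) {
            // Pick a pending operation type at random, but only emit it if
            // its precondition holds; otherwise retry with another pick.
            int pick = rng.nextInt(3);
            if (pick == 0 && created < creates) {
                ops.add(Op.CREATE); created++;
            } else if (pick == 1 && updated < updates && updated < created) {
                ops.add(Op.UPDATE); updated++;
            } else if (pick == 2 && deleted < deletes && deleted < updated) {
                ops.add(Op.DELETE); deleted++;
            }
        }
        return ops;
    }
}
```

The retry-on-blocked-pick approach is simple but does wasted work; the real `createCrudOperations` is presumably more efficient, which may be the "opaque" part.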
Thanks for the feedback / guidance!
Our use case is that we want to seed the database using bulk docs (as is the current behaviour), and then perform a mix of bulk insert, bulk update, bulk delete, insert, update, and delete operations in the second phase.
One solution would be to add more flags to the CRUD phase. However, I suppose one could achieve the same thing by running the "bulk insert / update" stage multiple times...
Bulk document creation seems straightforward to add. However, bulk updates are a bit more tricky. I imagine we could change the fetch/update workflow to be a bulk fetch/bulk update.
Both benchmarks would need a configurable batch size.
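For reference, a single `_bulk_docs` POST can already mix all three operation types in one batch: a document without a `_rev` is created, one with a `_rev` is updated, and one with `"_deleted": true` is deleted. A hypothetical three-document batch (the `_id` and `_rev` values here are placeholders) posted to `/<db>/_bulk_docs` would look like:

```json
{
  "docs": [
    { "_id": "doc-created", "value": 1 },
    { "_id": "doc-updated", "_rev": "1-967a00dff5e02add41819138abb3284d", "value": 2 },
    { "_id": "doc-deleted", "_rev": "1-967a00dff5e02add41819138abb3284d", "_deleted": true }
  ]
}
```

Since updates and deletes need current `_rev` values, a bulk update step would indeed have to be preceded by a bulk fetch (e.g. via `_all_docs`) of the same batch.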
I intend to start working on the implementation shortly - this is just a placeholder for comments / discussion until I can get a PR ready.