Closed hannomalie closed 6 years ago
Thank you for your report. We are aware of those limitations and maybe we could start removing some of them. To be honest, most of our team is on vacation this week, so we’ll have a look as soon as possible 😁
We had a chat about this. The original idea of the whole RESTHeart product has always been to stay as close as possibile to MongoDB functionalities. Now, as you wrote, GridFS does not implement any update semantic because this would involve long-term, non-atomic operations, which are hard to manage correctly:
In GridFS you are not removing/deleting a single document but actually a bunch of documents (files are split into chunks and each chunk is a separate document). That means replacing a file is simply not possible in an atomic manner.
GridFS is kind of a hackish feature. It is often better to just use a separate fileserver with a real filesystem to store the file content and only store the metadata in MongoDB.
Rif: https://stackoverflow.com/a/30074812/615095
We tend to agree with the above point of view. Adding file management transactionality in RESTHeart which alters GridFS semantics could represent for most users a departure from the idea that RESTHEart should copy MongoDB functionalities as much as possibile and could create an additional layer of complexity.
For example, now MongoDB 4.0 is adding ACID transactions, so in the future we will evaluate how to handle those from RESTHeart, but this will happen strictly within the boundaries of what the official Java driver permits.
We'd be more keen to evaluate the possibility to implement a module to interact with a real filesystem instead, for example by adding a configuration which points RESTHeart to a specific mounted path and use it for such purposes. It shouldn't be too difficult to use plain Java APIs to handle that. This way one could decide if needs GridFS (files handled by MongoDB, but with the mentioned limitations) or any "real" external fileserver mounted as a local filesystem. That is just an example, at the moment we don't have existing use cases for that.
Hi :)
Thank you for your effort, perfectly valid points that we can understand completely. On the other hand, we see restheart not as a mongodb facade - but also as an http facade, or as you say for yourself a restful web api. It hides the data storage from our users, or at least provides web operations for it. Don't you think that many of your customers think of it the same way? We're happy that documents and assets are treated nearly the same overall... that's why we (and our customers) expect similar/same behavior and support for HTTP verbs here. Implementing http verbs should not be limited by the driver or the database - and in fact, it isn't. The database or the driver just doesn't allow the nicest implementation, but deleting and recreating an asset should be just fine, because it is an implementation detail.
To implement our (i think very common) use case, we would have to build up a facade around restheart that handles assets. This is very complex and a lot of effort. I created a pull request that shows how an implementation could roughly look like: https://github.com/SoftInstigate/restheart/pull/308 . First tests with heavy load have been successful. I hope that you guys reconsider implementing this feature :)
Hi @hannespernpeintner
We're of course open to discuss this. I added a comment to your PR. It's an incremental feature, so it should not create any regression. It would help a lot if you could also contribute some integration tests. You can have a look at how we write ours.
Thanks! I'll do that as soon as I can :)
The respective PR has been merged, can be closed from my point of view.
Hi @hannespernpeintner, @lenalebt I'm closing this as the requirement has been implemented and will be released in version 3.5.
Thank you both for your contribution!
Feel free to re-open in case.
Hey guys,
thank you for your nice project :)
we have a lot of trouble because of the missing update-file functionality. If an asset already exists, putting the asset again results in 501 NOT IMPLEMENTED. We have two scenarios where this is a problem:
Current Behavior
In the past, we had a single threaded producer where we implemented a GET and conditional DELETE before each PUT. Our setup changed and we now have to handle multiple producers in the same application, which means our solution is no longer valid. We added synchronization on the "asset url" here, so this is pretty much solved. When we have multiple applications however, this method doesn't work. We'd like to have an eTag-based solution here that would work for multiple applications, but having to use DELETE in between makes this impossible.
We have to handle really large files (for example up to 700MB). We experience socket read timeout exceptions in the client here. We have a retry mechanism for every request, that performs the same request again. Despite the exception, the asset seems to be created... so GET for the asset and its binary url returns 202. We do a DELETE as usual. No other client does anything. The asset url as well as the binary url returns 404. The next PUT however, results in 501.
The second scenario could as well occur with multiple applications.
Expected Behavior
Context
We know that the missing put semantic is because of GridFS lacks this feature. However, we think restheart could implement PUT as an internal delete and create somehow. Our expectation is, that we can use put for files in the way it can be used for other documents.
Environment
We're using Linux and Windows, Restheart 3.3.1 (but already tested newer versions), MongoDB version 3.4.10
Steps to Reproduce
Our project setup is too complex to use it here, but my guess is that the following steps would suffice:
Result is 501
Possible Implementation
Here https://github.com/SoftInstigate/restheart/blob/0d87f8bd9b4b40167795fa6201f417561c61d49f/src/main/java/org/restheart/handlers/files/PutFileHandler.java should be a different logic in the exception handler in line ~100. This could be a possible solution: https://stackoverflow.com/questions/41223691/how-to-overwrite-image-in-mongodb-gridfs
Please tell me what you think about this and if you need any further information :) It would be interesting to know if you already investigated in this direction and if you prefer implementing it by yourselves or if you appreciate a pull request for this.
Regards, Hannes