arangodb / arangodb-java-driver

The official ArangoDB Java driver.
Apache License 2.0

Implement GridFS #265

Open ramazanpolat opened 5 years ago

ramazanpolat commented 5 years ago

Since ArangoDB competes with MongoDB, it would be nice to have something like GridFS implemented in the driver.

In the GridFS documentation, GridFS is defined as:

GridFS is a specification for storing and retrieving files that exceed the BSON-document size limit of 16 MB

I know ArangoDB doesn't have a 16 MB limit, but GridFS is not just for exceeding document size limits. It stores files in chunks, which benefits some use cases. For example, being able to put a profile picture into an ArangoDB collection would make developers' lives much easier. In the past, I stored TBs of occasionally accessed data in MongoDB, and it saved me a lot of time that I would otherwise have spent looking for a distributed file system. I'd like to see the ArangoDB drivers implement something like GridFS as well. In fact, I may try to implement it myself if this gets accepted.

dothebart commented 5 years ago

probably related to https://github.com/arangodb/arangodb/issues/107

rashtao commented 4 years ago

I think this is not driver related: if the db supports binary fields/objects in the future, the driver will support them as well.

Currently, you can achieve something similar from the driver by encoding binary data to base64 strings before saving and decoding them after reading, so that only strings are stored in the db. In this case, however, the stored data would be bigger than its binary representation, network transfers would be proportionally slower, and the client would have to spend CPU on encoding/decoding. Also pay attention to these limitations: https://github.com/arangodb/arangodb/issues/10754
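To make the size overhead concrete, here is a minimal sketch of the base64 round trip using only the JDK. The class and method names are hypothetical, and no driver call is shown: the encoded string would simply be stored as an ordinary string attribute of a document.

```java
import java.util.Base64;

// Hypothetical helper, not part of the ArangoDB Java driver.
class Base64Overhead {
    // Encode raw bytes as a base64 string that fits in a JSON string field.
    static String encode(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // Decode the stored string back into the original bytes.
    static byte[] decode(String stored) {
        return Base64.getDecoder().decode(stored);
    }

    public static void main(String[] args) {
        byte[] raw = new byte[3000]; // stand-in for a small file
        String stored = encode(raw);
        // Base64 emits 4 characters for every 3 input bytes.
        System.out.println(raw.length + " bytes -> " + stored.length() + " chars");
        // prints "3000 bytes -> 4000 chars"
    }
}
```

The encoded form is about one third larger than the binary input, which is the storage and transfer cost mentioned above.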

ramazanpolat commented 4 years ago

I think this is not driver related: if the db supports binary fields/objects in the future, the driver will support them as well.

In the case of MongoDB, GridFS is just a driver-side implementation that splits files into chunks and inserts them into a collection. MongoDB itself has no GridFS-specific component; the driver does all the work.
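As a rough illustration of that driver-side approach, the sketch below splits a byte array into ordered chunk documents and joins them again. Everything here is an assumption for illustration: the class name, the field names (`fileId`, `n`, `data`), and the use of base64 strings for the payload. It uses only the JDK and leaves out the step of writing the documents to a collection.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not an existing driver API.
class ChunkSketch {
    // Split a file into ordered chunk documents of at most chunkSize bytes each.
    static List<Map<String, Object>> split(String fileId, byte[] file, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<Map<String, Object>> chunks = new ArrayList<>();
        for (int offset = 0, n = 0; offset < file.length; offset += chunkSize, n++) {
            int end = Math.min(offset + chunkSize, file.length);
            Map<String, Object> doc = new LinkedHashMap<>();
            doc.put("fileId", fileId); // which file this chunk belongs to
            doc.put("n", n);           // position of the chunk within the file
            doc.put("data", Base64.getEncoder()
                    .encodeToString(Arrays.copyOfRange(file, offset, end)));
            chunks.add(doc);
        }
        return chunks;
    }

    // Rebuild the file; assumes the chunks are already sorted by "n".
    static byte[] join(List<Map<String, Object>> chunks) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Map<String, Object> doc : chunks) {
            byte[] part = Base64.getDecoder().decode((String) doc.get("data"));
            out.write(part, 0, part.length);
        }
        return out.toByteArray();
    }
}
```

For example, `ChunkSketch.split("avatar-1", bytes, 4)` on a 10-byte array yields three chunk documents, and `ChunkSketch.join` on that list returns the original 10 bytes. A real implementation would also need a metadata document per file (length, chunk size, checksum) and a query that fetches the chunks in order.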

Currently, you can achieve something similar from the driver by encoding binary data to base64 strings before saving and decoding them after reading, so that only strings are stored in the db.

This is actually what GridFS does.

In this case, however, the stored data would be bigger than its binary representation, network transfers would be proportionally slower, and the client would have to spend CPU on encoding/decoding.

Yes, you are right. The file will be bigger. But this is a trade-off that users will accept in some cases, especially for small files like profile pictures. The same trade-off applies to GridFS.

Also pay attention to these limitations: arangodb/arangodb#10754

There are similar limitations in MongoDB, and this is why they came up with the GridFS specification in the first place.

rashtao commented 4 years ago

Thanks for clarifying. In general, I like the idea. I would proceed by defining a specification, which would then be implemented by all the drivers.