tomdesair / tus-java-server

Library to receive tus v1.0.0 file uploads in a Java server environment
MIT License

UploadStorageService - Files in the disk and Info in the database #22

Closed ksvraja closed 4 years ago

ksvraja commented 5 years ago

Expected Behaviour

Hi Tom, thanks for the great work. The library works great for our file transfer use case.

We are currently using the Spring Framework, and we have the following requirements:

  1. Store the files in a particular hierarchy, e.g. /uploads/folder1/folder2/filename.zip
  2. Store the UploadInfo in the database instead of in a local folder
  3. Provide the file reference directly to UploadIdFactory instead of parsing it from the URL (Spring already does this)

Can you provide guidelines on how to implement this? We can develop it and share the result.

softboy99 commented 5 years ago

We also need this.

tomdesair commented 5 years ago

Hi @ksvraja,

Thank you for your feedback.

  1. Using the withStoragePath(String) method of the TusFileUploadService class, you should be able to configure the path /uploads/folder1/folder2. If the path needs to be different for each upload, then I would suggest creating your own custom DiskStorageService class and providing it via the withUploadStorageService method.
  2. I have been thinking about this since I read your message a few months ago. If I add a database-based UploadStorageService, I want to make sure that it is compatible with different database technologies like MySQL, PostgreSQL, Oracle... This would then also require a database-based UploadLockingService. So I want to keep it as simple as possible. Currently I'm thinking of a table structure like:
    • id: varchar, primary key, not null -> the upload ID
    • creation_timestamp: datetime, not null -> timestamp the upload was created
    • last_update_timestamp: datetime, not null -> timestamp the upload was last updated
    • locked: boolean, not null -> Indicates whether the upload is locked for processing or not. This is needed to provide a database-based UploadLockingService
    • metadata: blob, not null -> Byte-array that contains the serialized form of the UploadInfo object
    • data: blob, null -> Byte-array that contains the bytes of the actual upload. I want to give users the option to also store the upload data itself in the DB. However, the default would still be to write uploaded bytes to disk.
  3. I'm not sure what you mean by "provide the filereference directly to UploadIdFactory". Do you mean a new method like UploadId createUploadId(String uploadId)?
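
As a rough illustration of the table proposed in point 2, the sketch below mirrors the listed columns in a plain Java class and shows how the locked flag could back a locking service. Everything here is an assumption made for illustration: the table name `upload`, the `VARCHAR(255)` length, the class names, and the in-memory map standing in for a real database. In an actual database the lock would likely be a conditional update rather than a map operation.

```java
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** One row of the proposed table; fields mirror the columns listed above. */
final class UploadRow {
    final String id;                  // primary key: the upload ID
    final Instant creationTimestamp;  // when the upload was created
    Instant lastUpdateTimestamp;      // when the upload was last updated
    boolean locked;                   // true while the upload is being processed
    byte[] metadata;                  // serialized form of the UploadInfo object
    byte[] data;                      // optional upload bytes, null when stored on disk

    UploadRow(String id, byte[] metadata, Instant now) {
        this.id = id;
        this.metadata = metadata;
        this.creationTimestamp = now;
        this.lastUpdateTimestamp = now;
    }
}

/** In-memory stand-in for the table, only meant to illustrate the lock semantics. */
final class UploadTableSketch {
    // Hypothetical DDL; column types differ between MySQL, PostgreSQL, Oracle, ...
    static final String CREATE_TABLE =
        "CREATE TABLE upload ("
        + "id VARCHAR(255) NOT NULL PRIMARY KEY, "
        + "creation_timestamp DATETIME NOT NULL, "
        + "last_update_timestamp DATETIME NOT NULL, "
        + "locked BOOLEAN NOT NULL, "
        + "metadata BLOB NOT NULL, "
        + "data BLOB NULL)";

    private final Map<String, UploadRow> rows = new ConcurrentHashMap<>();

    void insert(String id, byte[] metadata) {
        rows.put(id, new UploadRow(id, metadata, Instant.now()));
    }

    /** Similar in spirit to: UPDATE upload SET locked = TRUE WHERE id = ? AND locked = FALSE. */
    boolean tryLock(String id) {
        boolean[] acquired = {false};
        // computeIfPresent runs atomically per key on a ConcurrentHashMap
        rows.computeIfPresent(id, (key, row) -> {
            if (!row.locked) {
                row.locked = true;
                acquired[0] = true;
            }
            return row;
        });
        return acquired[0];
    }

    void unlock(String id) {
        rows.computeIfPresent(id, (key, row) -> {
            row.locked = false;
            return row;
        });
    }
}
```

A second `tryLock` on the same ID returns false until `unlock` is called, which is the behaviour a database-based locking service would need from the locked column.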

What do you think about the database approach? If you agree, I can try to free up some time to build this into the library.

ksvraja commented 5 years ago

Hi Tom, in our current project we have implemented this functionality as follows.

  1. Defined an interface to store / retrieve the metadata:
public interface IndexStorage {
    public UploadInfo create(UploadInfo info, String fileName);

    public UploadInfo update(UploadInfo info) throws UploadNotFoundException;

    public UploadInfo get(UploadInfo info) throws UploadNotFoundException;

    public UploadInfo getByFilePath(String filePath) throws UploadNotFoundException;

    public UploadInfo get(UploadId id) throws UploadNotFoundException;

    public UploadInfo get(long id /* Primary key*/) throws UploadNotFoundException;

    public int delete(UploadId id);

    public String getFilePath(UploadInfo info) throws UploadNotFoundException;

    public String newFilePath(UploadInfo info);
}
  2. Provided an implementation of IndexStorage; we have implemented this one for MariaDB.
  3. Modified the DiskStorageService class to use the IndexStorage interface.
  4. We currently use the disk-based UploadLockingService.
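
To make the idea of a disk storage class that consults an index more concrete, here is a minimal, self-contained sketch. It is not the code from our project and does not use the library's types: `IndexedDiskWriter` and the `pathLookup` function are hypothetical names, and the lookup stands in for whatever the index (for example a database table) would return for an upload ID.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.function.Function;

/** Hypothetical sketch: upload bytes go to disk, the target path comes from an index. */
final class IndexedDiskWriter {
    private final Path baseDir;
    // Maps an upload ID to a path relative to baseDir, e.g. backed by a database lookup
    private final Function<String, String> pathLookup;

    IndexedDiskWriter(Path baseDir, Function<String, String> pathLookup) {
        this.baseDir = baseDir.toAbsolutePath().normalize();
        this.pathLookup = pathLookup;
    }

    /** Resolves the on-disk location and rejects paths that escape the base directory. */
    Path resolve(String uploadId) {
        Path target = baseDir.resolve(pathLookup.apply(uploadId)).normalize();
        if (!target.startsWith(baseDir)) {
            throw new IllegalArgumentException("Path escapes base directory: " + target);
        }
        return target;
    }

    /** Appends a chunk to the upload's file and returns the new file length (the new offset). */
    long append(String uploadId, byte[] chunk) {
        Path target = resolve(uploadId);
        try {
            Files.createDirectories(target.getParent());
            Files.write(target, chunk, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            return Files.size(target);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a lookup that returns `folder1/folder2/filename.zip` for a given ID, the file ends up under the configured base directory in that hierarchy, while a lookup result such as `../escape.bin` is rejected.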

My thoughts on how this could be implemented:

  1. An interface definition to access the metadata (.info)
  2. DiskStorageService modified to use this interface.
  3. A default implementation that handles disk-based metadata storage.
  4. Developers can write their own metadata implementation for database-based access (this could be RDBMS or NoSQL)
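
The split described in the four points above could look roughly like the sketch below. It is deliberately independent of the library: `UploadMeta` is a simplified, hypothetical stand-in for UploadInfo, and `UploadMetadataStore` and `InMemoryMetadataStore` are illustrative names, not existing classes. A disk-based, RDBMS, or NoSQL implementation would implement the same interface.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical, simplified stand-in for the library's UploadInfo. */
final class UploadMeta {
    final String uploadId;
    final String filePath;
    final long offset;

    UploadMeta(String uploadId, String filePath, long offset) {
        this.uploadId = uploadId;
        this.filePath = filePath;
        this.offset = offset;
    }
}

/** Hypothetical metadata abstraction that a storage service would depend on. */
interface UploadMetadataStore {
    void save(UploadMeta meta);                        // create or update
    Optional<UploadMeta> findById(String uploadId);
    Optional<UploadMeta> findByFilePath(String filePath);
    boolean delete(String uploadId);                   // true if an entry was removed
}

/** Simplest possible implementation; a database-backed one would replace the map. */
final class InMemoryMetadataStore implements UploadMetadataStore {
    private final Map<String, UploadMeta> byId = new ConcurrentHashMap<>();

    @Override
    public void save(UploadMeta meta) {
        byId.put(meta.uploadId, meta);
    }

    @Override
    public Optional<UploadMeta> findById(String uploadId) {
        return Optional.ofNullable(byId.get(uploadId));
    }

    @Override
    public Optional<UploadMeta> findByFilePath(String filePath) {
        return byId.values().stream()
            .filter(meta -> meta.filePath.equals(filePath))
            .findFirst();
    }

    @Override
    public boolean delete(String uploadId) {
        return byId.remove(uploadId) != null;
    }
}
```

Because the storage service would only see the interface, swapping the in-memory map for a JDBC or NoSQL implementation would not add any dependency to the core library.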

Providing a standard implementation of database-based access might bring the complexity of supplying the data source implementation (Hibernate, JdbcTemplate, etc.) and its dependent libraries. However, if you could provide a reference / example implementation, that would be great. It would not be a good idea to embed it inside the TUS library, though.

Personally, I prefer the TUS library with minimal dependencies. This makes adopting/embedding it in projects easier.

I can contribute to this enhancement. Please let me know your thoughts.

git-josip commented 4 years ago

Hi @ksvraja. Any chance that you can share your implementation of the custom DiskStorageService?

nseb commented 4 years ago

Hi @ksvraja, can you share your implementation?

ksvraja commented 4 years ago

I have shared the code for reference here: https://github.com/ksvraja/tus_ext. We have deeply integrated the DB implementation within our framework, so the code may not be ready to use as-is, but it can serve as a reference.