
File upload service #1052

Open ranhsd opened 6 years ago

ranhsd commented 6 years ago

Hi, I am building an app and want to use Feathers on the server side only. I am planning to migrate my current server-side implementation from another Node.js framework to FeathersJS. The reason is that FeathersJS allows me to create multiple services, each of which can use a different database.

One feature I noticed FeathersJS doesn't solve within the framework itself is file upload. I also noticed that there are some questions about it on GitHub as well as on Stack Overflow.

After reading the questions and guides (like this one: https://github.com/feathersjs/docs/blob/master/guides/advanced/file-uploading.md) I came up with a solution, and I would really like you to share your thoughts about it.

Before I start describing the solution, here are some assumptions:

My solution goes like this:

The service

I started by generating a new Feathers service via the Feathers CLI service generator. The service name is files.

Next, I npm installed multer, which allows me to upload one or multiple files and handles multipart/form-data requests.

My files.service.js content looks like the following:

// Initializes the `files` service on path `/files`
const createModel = require('../../models/files.model')
const hooks = require('./files.hooks')
const createService = require('feathers-mongoose')

const multer = require('multer')
const multipartMiddleware = multer()

module.exports = function (app) {
  const Model = createModel(app)
  const paginate = app.get('paginate')

  const options = {
    name: 'files',
    Model,
    paginate
  }

  // Initialize our service with any options it requires.
  // The multer middleware parses the multipart/form-data body; the next
  // middleware copies the parsed files into Feathers params so hooks can
  // access them via context.params.files.
  app.use('/files',
    multipartMiddleware.array('file', parseInt(process.env.FILES_SERVICE_MAX_ITEMS) || 1),
    function (req, res, next) {
      req.feathers.files = req.files;
      next();
    },
    createService(options)
  );

  // Get our initialized service so that we can register hooks and filters
  const service = app.service('files')

  service.hooks(hooks)
};

You can see that I use multer's array because this solution allows users to upload multiple files. From the code above you can also see that you can easily change the maximum number of files via the FILES_SERVICE_MAX_ITEMS environment variable (I run Feathers in Docker + docker-compose locally and on Kubernetes remotely, so it's very easy and straightforward to define environment variables there).

You can also notice that I didn't make many changes to the generated service implementation; it's still very basic.

Before create hook

I generated a before create hook, again using the Feathers CLI, and named it uploadFilesToGcs. This hook takes the file content and metadata and uses the Google Cloud Storage SDK to upload them to Google Cloud Storage.

This is the code inside the upload-file-to-gcs hook file:

// Use this hook to manipulate incoming or outgoing data.
// For more information on hooks see: http://docs.feathersjs.com/api/hooks.html

const storage = require('@google-cloud/storage')
const { randonHexString } = require('../utils')
const { BadRequest } = require('@feathersjs/errors')

const gcs = storage({
  projectId: process.env.GCP_PROJECT_ID || undefined,
  keyFilename: process.env.GCP_KEYFILE_PATH || undefined,
})

const bucket = gcs.bucket(process.env.GCS_BUCKET)

module.exports = function (options = {}) { // eslint-disable-line no-unused-vars
  return async context => {

    const files = context.params.files

    if (!files || files.length === 0) {
      throw new BadRequest("No files found to upload")
    }

    // Upload each file to GCS in parallel and collect the metadata
    // records that will be stored in the database.
    const promises = files.map(file => new Promise((resolve, reject) => {

      // Prefix the original name with a random hex string so names are unique
      const fileName = randonHexString(32) + '_' + file.originalname
      const gcsFile = bucket.file(fileName)
      const mimeType = file.mimetype

      const resultFile = {
        bucket: bucket.name,
        provider: "google",
        name: fileName,
        contentType: mimeType,
        originalName: file.originalname
      }

      // Stream the in-memory buffer to a publicly readable GCS object
      const stream = gcsFile.createWriteStream({
        public: true,
        metadata: {
          contentType: mimeType
        }
      });

      stream.on('error', err => reject(err));
      stream.on('finish', () => resolve(resultFile));

      stream.end(file.buffer);
    }));

    const result = await Promise.all(promises)

    context.data = result

    return context

  };
};

In the code above you can see that I first initialize the Google Cloud SDK with my project id and service account key (both values are stored in my app's environment variables), then perform a very simple validation to check that there really are files to work with, then create an array of promises that upload all files to Google Cloud Storage via the SDK. Finally, I pass the results to the hook's context data.

The context data will contain an array of objects that will be stored in the database (in my case I use MongoDB, but of course Feathers allows me to use any other SQL/NoSQL database). The object stored in the database has the following structure:

module.exports = function (app) {
  const mongooseClient = app.get('mongooseClient');
  const { Schema } = mongooseClient;
  const files = new Schema({
    name: { type: String },
    originalName: { type: String },
    contentType: { type: String },
    bucket: { type: String },
    provider: { type: String }
  }, {
      timestamps: true
    });

  return mongooseClient.model('files', files);
};

name - the name of the file that I generate to make sure it is unique
originalName - the original name of the file as sent by the client
contentType - the MIME type as sent by the client (with the help of multer)
bucket - the bucket name where the items are stored
provider - the provider who hosts it; in my case I use Google

As you can see, I don't store the item URL here because I can calculate it later (in the after find and after get hooks). I also don't strictly need to store the bucket and provider, but I decided to do so because it allows me to use one service with multiple providers: from the client I can decide that I want to upload a file to S3, and the only thing I would need to do is check the payload in the before create hook and use an Amazon S3 upload adapter instead of Google Cloud Storage. So this solution actually allows me to use multiple storage providers side by side, for example as sketched below.
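As a hedged illustration only (not part of my actual implementation), a provider dispatch in the before create hook could look roughly like this. upload-files-to-s3 is a hypothetical hook assumed to have the same shape as upload-file-to-gcs, and reading the target provider from context.data.provider is an assumption about how the client would send it:

// Hypothetical sketch: pick a storage adapter per request.
// upload-files-to-s3 does not exist in this solution; it is assumed to
// expose the same hook signature as upload-file-to-gcs.
const uploadFilesToGcs = require('../../hooks/upload-file-to-gcs')
const uploadFilesToS3 = require('../../hooks/upload-files-to-s3') // hypothetical

module.exports = function (options = {}) {
  return async context => {
    // Assumption: the client sends the target provider as a regular
    // form field, which multer places into context.data.
    const provider = (context.data && context.data.provider) || 'google'
    const upload = provider === 'aws' ? uploadFilesToS3(options) : uploadFilesToGcs(options)
    return upload(context)
  }
}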

After find + after get hooks

Each of the files must have a URL so users will be able to access it from within the app, a browser, etc. As I mentioned above, I prefer not to store absolute URLs in my database for various reasons: to stay storage agnostic, and because I cannot count on the storage provider never changing their storage URL in the future. That's why I decided to "calculate" the file URL in the application layer. For that purpose I didn't generate a separate hook (like I did above); I went for the "quick and dirty" solution and wrote the code inside the files.hooks.js file directly, but you can definitely generate a new hook for that to create a cleaner solution.

Here is the content of the files.hooks.js file:

const { authenticate } = require('@feathersjs/authentication').hooks;

const uploadFilesToGcs = require('../../hooks/upload-file-to-gcs');

module.exports = {
  before: {
    all: [
      authenticate('jwt')
    ],
    find: [],
    get: [],
    create: [
      uploadFileToGcs()
    ],
    update: [],
    patch: [],
    remove: []
  },

  after: {
    all: [],
    find: [
      // With pagination enabled, the results live under result.data
      hook => {
        hook.result.data.forEach(result => {
          handleResult(result)
        })
      }
    ],
    get: [
      hook => {
        handleResult(hook.result)
      }
    ],
    create: [
      // Creating with an array payload returns an array of created records
      hook => {
        hook.result.forEach(result => {
          handleResult(result)
        })
      }
    ],
    update: [],
    patch: [],
    remove: []
  },

  error: {
    all: [],
    find: [],
    get: [],
    create: [],
    update: [],
    patch: [],
    remove: []
  }
};

const handleResult = result => {
  if (!result.url) {
    result.url = `https://storage.googleapis.com/${result.bucket}/${result.name}`
  }
  if (result.bucket) {
    delete result.bucket
  }

  if (result.provider) {
    delete result.provider
  }
}

From the code above it's easy to see that I build the file URL manually by concatenating the bucket name and the file name onto the storage URL. Also, I remove the bucket name and the provider name from the result because the user shouldn't care about them.

Don't forget that I can access the provider and bucket fields here and build the URL according to the provider name, so if it is "aws" I can simply use the Amazon S3 storage URL. But because this solution is for Google Cloud Storage only, I don't even check the provider name. A provider-aware version of handleResult could look roughly like the sketch below.
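A hedged sketch (not part of my implementation) of a provider-aware handleResult; the S3 URL format shown is an assumption for illustration and depends on the bucket's region and settings:

// Hypothetical: build the public URL depending on the stored provider.
const buildUrl = result => {
  if (result.provider === 'aws') {
    // Assumed S3 virtual-hosted-style URL; real buckets may need a region suffix.
    return `https://${result.bucket}.s3.amazonaws.com/${result.name}`
  }
  return `https://storage.googleapis.com/${result.bucket}/${result.name}`
}

const handleResult = result => {
  if (!result.url) {
    result.url = buildUrl(result)
  }
  delete result.bucket
  delete result.provider
}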

How to test it

To test this solution you first must have a running FeathersJS app. Then you need to npm install multer and the Google Cloud Storage SDK, and after your server is running, use any REST client (I prefer Postman because it's nice and easy).

[screenshot: Postman request headers showing the Authorization (JWT) and Content-Type headers]

From the image above you can see that I pass a JWT token in the Authorization header, but FeathersJS allows you to change that and expose the service publicly.

Also, please note the Content-Type header: multer expects it to be multipart/form-data.

This is what my request body looks like:

[screenshot: Postman form-data request body with two file fields]

As you can see from the image above, I can upload 2 photos in one service call. Of course you can modify the multer config and upload more than 2, but please note that this consumes memory, since the photos are kept in memory during the request. An equivalent browser request is sketched below.
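For illustration only, a hedged browser-side equivalent of the Postman request; it assumes the service is mounted at /files on localhost:3030, the field name file (matching multipartMiddleware.array('file', ...)), a fileInput file input element and a jwt variable holding a valid token:

// Hypothetical browser-side upload matching the Postman request above.
const form = new FormData()
form.append('file', fileInput.files[0])
form.append('file', fileInput.files[1])

fetch('http://localhost:3030/files', {
  method: 'POST',
  headers: { Authorization: `Bearer ${jwt}` },
  // Do not set Content-Type manually; the browser adds the multipart boundary.
  body: form
})
  .then(res => res.json())
  .then(console.log)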

Finally, here is the response:

[
    {
        "_id": "5bbf1636eb2dee028f9ca5ae",
        "name": "********_image1.jpg",
        "contentType": "image/jpeg",
        "originalName": "image1.jpg",
        "createdAt": "2018-10-11T09:21:58.489Z",
        "updatedAt": "2018-10-11T09:21:58.489Z",
        "__v": 0,
        "url": "https://storage.googleapis.com/********-draft/********_image1.jpg"
    },
    {
        "_id": "5bbf1636eb2dee028f9ca5af",
        "name": "********_image2.png",
        "contentType": "image/png",
        "originalName": "image2.png",
        "createdAt": "2018-10-11T09:21:58.489Z",
        "updatedAt": "2018-10-11T09:21:58.489Z",
        "__v": 0,
        "url": "https://storage.googleapis.com/********-draft/********_image2.png"
    }
]

I use ** to mask the data :)

That's it. Please let me know what you think about this solution. I was also thinking that maybe we could extract this solution into a separate module, something like feathers-storage, which would define an interface implemented by various providers (e.g. Google, AWS, Azure, grid store and more). A rough idea of such an interface is sketched below.
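Purely as a hypothetical sketch of the proposed feathers-storage idea (nothing here exists as a published module; all names are illustrative):

// Hypothetical provider interface for a feathers-storage style module.
class StorageProvider {
  // Upload a single multer-style file object; resolves with the metadata
  // record to store in the database (name, bucket, contentType, ...).
  async upload(file) {
    throw new Error('upload() must be implemented by a provider')
  }

  // Build a public URL for a previously stored file record.
  getUrl(record) {
    throw new Error('getUrl() must be implemented by a provider')
  }
}

module.exports = StorageProvider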

Thanks in advance! Ran.

aessig commented 5 years ago

@ranhsd thanks for your comments. It's working fine for me with Postman. I just can't figure out how to send files from the browser using socket-io as the transporter.

const filepath = 'User/abc/test.png';
feathers.service('upload').create({
  param1: 12345,
}, {
  headers: {
    'Content-Type': 'multipart/form-data',
  },
});

Any ideas about how it could be done?

ranhsd commented 5 years ago

Hi @aessig, actually I didn't try it with Socket.IO. I use sockets only for real-time updates in my app; everything else I prefer to do statelessly (using REST).

soesujith commented 4 years ago

Quoting @aessig: "@ranhsd thanks for your comments. It's working fine for me with Postman. I just can't figure out how to send files from the browser using socket-io as the transporter. [...] Any ideas about how it could be done?"

Hi @aessig, were you able to figure out how to upload from the browser using socket.io? I have the same issue; it works in Postman.

ranhsd commented 4 years ago

Hi @sbsujith, I think you don't need to use websockets for uploading files. For this specific operation you can use REST (this is what I did). FeathersJS gives you the ability to have two transports for your services (REST and sockets), so just use REST for this specific service; a minimal sketch of registering both transports follows.
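For context, a minimal sketch of an app that registers both transports, assuming the @feathersjs/express and @feathersjs/socketio packages (not code taken from this thread):

const feathers = require('@feathersjs/feathers')
const express = require('@feathersjs/express')
const socketio = require('@feathersjs/socketio')

const app = express(feathers())

// Standard body parsers for regular JSON/urlencoded requests
app.use(express.json())
app.use(express.urlencoded({ extended: true }))

// Register both transports: real-time services can use the socket,
// while the files service is called over plain REST.
app.configure(express.rest())
app.configure(socketio())

app.listen(3030)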

soesujith commented 4 years ago

Thanks @ranhsd for the response. I could access the file over socket.io on the server at context.data.files instead of context.params.files.
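For illustration only, a hedged sketch of what the client side of that could look like; it assumes a configured Feathers socket.io client named app, a file input element fileInput, and that the hook reads multer-style fields from context.data.files as noted above:

// Hypothetical browser-side sketch: send file bytes through the socket
// transport so the server sees them in context.data.files.
const file = fileInput.files[0]
const reader = new FileReader()

reader.onload = () => {
  app.service('files').create({
    files: [{
      buffer: reader.result,          // ArrayBuffer with the file contents
      originalname: file.name,        // mirrors multer's field names
      mimetype: file.type
    }]
  })
}

reader.readAsArrayBuffer(file)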

ranhsd commented 4 years ago

Awesome @sbsujith

fitsumbelay commented 4 years ago

For those who are looking at this or facing the same issue, here is a solution: you can use Dropzone over socket.io or the REST API to upload a file to your database. You have to use hooks. First, if you are using a normal file upload or Dropzone, you need to convert the file to a data URI in a before create hook; check the code below.

const dauria = require('dauria');

module.exports = {
  before: {
    all: [],
    find: [],
    get: [],
    create: [
      async context => {
        if (!context.data.uri && context.params.file) {
          const file = context.params.file;
          const uri = dauria.getBase64DataURI(file.buffer, file.mimetype);
          context.data = { uri: uri };
        }
        return context;
      }
    ],
    update: [],
    patch: [],
    remove: []
  },

  after: { all: [], find: [], get: [], create: [], update: [], patch: [], remove: [] },

  error: { all: [], find: [], get: [], create: [], update: [], patch: [], remove: [] }
};

in yourservice.hooks.js

Then you are done. For more detail check:

https://docs.feathersjs.com/cookbook/express/file-uploading.html#basic-upload-with-feathers-blob-and-feathers-client
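For reference, a hedged sketch of the service registration described in the linked cookbook recipe, assuming the feathers-blob and fs-blob-store packages and a local ./uploads directory (not code from this thread):

const feathers = require('@feathersjs/feathers')
const express = require('@feathersjs/express')
const blobService = require('feathers-blob')
const fsBlobStore = require('fs-blob-store')

const app = express(feathers())

// Raise the body size limit so base64 data URIs fit in the JSON payload
app.use(express.json({ limit: '10mb' }))
app.configure(express.rest())

// Store uploaded blobs on the local filesystem under ./uploads;
// the dauria hook above converts raw uploads into { uri } payloads for it.
app.use('/uploads', blobService({ Model: fsBlobStore('./uploads') }))

app.listen(3030)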