TryGhost / Ghost

Independent technology for modern publishing, memberships, subscriptions and newsletters.
https://ghost.org
MIT License
47.25k stars 10.3k forks source link

Server-side image upload refactor #635

Closed ErisDS closed 10 years ago

ErisDS commented 11 years ago

At the moment, Ghost image uploads are thrown straight at the local file system. This is great and exactly what we want for standard Ghost - no config of an external system required.

However, it does pose a problem for hosting. Most node hosting platforms do not persist the local file system, and this will be true for our own platform.

The server-side part of accepting and handling uploads, along with the retrieval of images needs to be refactored such that the local file system handler is a module and is the 'default' image upload handler.

The handler should support upload/post requests, retrieve/get requests, and delete requests, but not modify. The handler should store the files in an immutable fashion, using a unique filename (closing #619).

Ghost should then be wired up in such a way that if another handler is configured it will use that, but otherwise it will use the local file system handler by default for all requests to store or retrieve an image (there's no concept of delete in the UI yet).

We will also need to implement a different file upload handler, but the details of what it should do, and whether or not it should be core or a plugin are not yet confirmed.

jamesbloomer commented 11 years ago

A few approaches could be taken.

1 - The easiest is to assume that any images will always be available via a normal url. This would be applicable for the local file system or something like S3. In this case the retrieve method doesn't need to be implemented, it's just a url. The upload method needs to put the file wherever and return the correct url, the delete likewise. In this case the admin controller upload function needs to forward to the handler.

2 - If however the uploaded image is not available via a static url, maybe in a database somewhere, then the image will have to be streamed down to the client. The difficulty here is to try and enable some caching and the load on the server may be higher, see here for a discussion.

3 - Also a hybrid approach could be taken, using a dynamic route in express, then using res.sendfile to return the file. This has the advantage of allowing some dynamic logic but also caching, see res.sendfile. Although with this approach it looks like the file still has to be on disk for express.

The first approach is the easiest but makes some assumptions that the files will always be accessible in a certain way.

jamesbloomer commented 11 years ago

Anyone else started doing anything with this? Got time to start extracting the local file system storage, can then push to a branch for other people to work on.

ErisDS commented 11 years ago

No one has picked it up just yet, I can create a branch and pick it up from you :+1:

jamesbloomer commented 11 years ago

Sounds good.

ErisDS commented 11 years ago

Branch is called image-refactor - just so it's easy to find ;)

ErisDS commented 11 years ago

Progress on this issue exists on the image-refactor branch. There is a somewhat related issue open #937.

matteocrippa commented 11 years ago

'll try to give my 2 cents about this, probably you will have to point out the two most common situation when Ghost is hosted in a not writable fs hosting:

Dunno if you plan to have this already in place now, but probably it would be interesting to allow user to pickup their own way allowing to use a plugin to re-route in the best way their uploads.

Eg. (I admit prob I don't know if this is still allowed) Some user may prefer to use Dropbox to store the public attachments.

jamesbloomer commented 11 years ago

Almost ready to push the S3 storage version, all pretty straight forward. Undoubtedly there will be other types of storage that people want to use.

hugo commented 11 years ago

Currently it looks like Ghost will act as a relay for any image storage service i.e. the image will be uploaded to Ghost and then to S3/Azure/etc. Is the plugin system flexible enough that eventually this could be refactored to enable direct upload from clients to file storage services (S3, Azure and GCS all provide this) without hitting the Ghost server.

Things to worry about here that I see (probably there are more):

[1] You can get the hash from S3, but then you have to relate hashes to images somewhere, so you've added a lot of work and an extra database field, a least, to avoid a hopefully uncommon situation of double uploads of the same file.

[2] You have to get the filename from the client, check with the file store if it exists, or what the next available increment-appended name is, get an endpoint URL, trigger the upload from the client, verify the upload completed when the client says it did, and store the final URL somewhere.

ErisDS commented 10 years ago

Wow. This issue is being referenced to death :astonished:

Just wanted to drop a note on the end of here to say I'm currently doing further work on abstracting the filesystem in Ghost. The intention is to remove the assumption that we use the local file system, and instead use a module which proxies to the local file system if nothing else is defined.

Expect to see stuff change over the coming days.

ErisDS commented 10 years ago

Consider this complete for now.

jamesbloomer commented 10 years ago

Depending on what your definition of proxy is on the previous comment... When I started this I had a look at using Ghost to proxy to an external storage location eg. S3, rather than expose the image directly from S3 as a link. It's definitely possible but has the downside of double the bandwidth, the image from S3 to the Ghost server, then the image to the client. Worth bearing in mind.

thecolorblue commented 10 years ago

Is anyone else working on another storage method? I might start on implementation for modulus.io.

ErisDS commented 10 years ago

We haven't finished work on our App boilerplate yet, so just now there is no way to override the storage method other than hacking Ghost. Once the App stuff is officially launched adding this kind of thing will be MUCH easier, you can see progress in #1474.

bioball commented 9 years ago

Has any progress been made on this front?

ErisDS commented 9 years ago

Yes, see https://github.com/TryGhost/Ghost/wiki/Using-a-custom-storage-module

bioball commented 9 years ago

Ah great, thanks!