keystonejs / keystone-classic

Node.js CMS and web app framework
http://v4.keystonejs.com
MIT License
14.64k stars · 2.21k forks

StorageAPI for Images and Files #526

Closed grabbou closed 5 years ago

grabbou commented 10 years ago

Draft

To store images today, we have to use CloudinaryImage if we are on Heroku (or whenever we want the ability to resize or otherwise manipulate images). Not everybody is happy paying a monthly fee just for those features. Cloudinary is great, but not everyone uses all of its features. That's why we need to take our Image and File fields to the next level and make them work like they do in other CMS systems.

Storage API

REPOSITORY: https://github.com/grabbou/StorageAPI/

Will be released as a stand-alone NPM package and, after that, integrated into Keystone.

Deliver Keystone.StorageAPI so developers can easily use storage (when available) in their projects. The Storage API can be used either in fields or in our own views and modules. It's public: define once, use multiple times.

Directory structure

providers/
providers/storage/index.js // home of our Storage 'class'
providers/storage/...
providers/storage/amazon.js // the name 'amazon' becomes available via keystone.set('storage')
providers/storage/azure.js

// once we finish, it's time to write the mail API
providers/mail/index.js // mail resolver; works like the storage one (obtain() method)
providers/mail/postmark.js
providers/mail/mandrill.js
providers/mail/smtp.js

Storage

keystone.set('storage', {
   'azure': {},
   'amazon': {}
});

Please note that async methods are wrapped in promises (Q library) to avoid callback chains and provide better functionality.
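The wrapping described above can be sketched with a stand-in provider. The client object and method names below are hypothetical, not StorageAPI's actual API, and a native Promise is used in place of Q to keep the sketch dependency-free:

```javascript
// Wrap a node-style (error-first callback) method in a promise,
// as the spec proposes doing for all async provider methods.
function promisify(fn) {
  return function (...args) {
    return new Promise((resolve, reject) => {
      fn(...args, (err, result) => (err ? reject(err) : resolve(result)));
    });
  };
}

// Stand-in provider with a callback-style upload (hypothetical names)
const amazonClient = {
  upload(localPath, remotePath, cb) {
    // ...a real client would stream the file to S3 here...
    cb(null, { url: 'https://s3.example.com/' + remotePath });
  },
};

const upload = promisify(amazonClient.upload.bind(amazonClient));

// callers now chain instead of nesting callbacks:
upload('/tmp/photo.jpg', 'uploads/photo.jpg')
  .then((file) => console.log(file.url)) // https://s3.example.com/uploads/photo.jpg
  .catch((err) => console.error(err));
```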

List of supported providers to start with

We're removing multiple fields that do the same job and replacing them with the following ones. We are going to delete ui-azure.js, ui-s3file.js and many, many more (along with their CSS3 stylesheets; going simple now!)

File

A simple field for handling files. Accepts any storage (and uses the default one specified in the project). Returns false if the uploaded file doesn't match allowedTypes.

Schema:

Accepted fields:

Available hooks:
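Since the draft leaves the schema unspecified, here is a rough sketch of how the field and its allowedTypes check might look; all names below (the list, the option keys) are hypothetical:

```javascript
// Hypothetical usage of the proposed File field; the option names
// follow the draft above ('storage', 'allowedTypes') but may differ
// in the final API:
//
//   Upload.add({
//     attachment: {
//       type: Types.File,
//       storage: 'amazon', // a provider key from keystone.set('storage')
//       allowedTypes: ['application/pdf', 'image/png'],
//     },
//   });
//
// The "returns false if the uploaded file doesn't match allowedTypes"
// behaviour could be as simple as a mimetype check:
function isAllowed(file, allowedTypes) {
  if (!allowedTypes || !allowedTypes.length) return true; // no restriction set
  return allowedTypes.indexOf(file.mimetype) !== -1;
}

console.log(isAllowed({ mimetype: 'application/pdf' }, ['application/pdf'])); // true
console.log(isAllowed({ mimetype: 'image/gif' }, ['application/pdf'])); // false
```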

A generic ImageField for uploading & handling images. Removes CloudinaryImage, as its behaviour is now built into this field. The same goes for the other image-related fields.

Schema:

Accepted fields:

Underscored methods

Available hooks:

Removes hard-coded Cloudinary support. Allows uploading files to the default storage provider when available, and allows overriding that by specifying keystone.set('wysiwyg storage'). To enable image upload, call keystone.set('wysiwyg image upload'). This means you can now upload your wysiwyg images easily, wherever you want: to FTP or to Amazon S3. Want to resize your images while uploading? Easy! Just set keystone.set('wysiwyg image options') and we will adjust your input using the ImageField underscored methods. No more resizing high-res photos before uploading.
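A sketch of how these settings might be wired up. The setting names come from the paragraph above; the keystone object is stubbed here so the example stands alone:

```javascript
// Stand-in for the real keystone settings store, so this runs standalone
const settings = {};
const keystone = {
  set: (key, value) => { settings[key] = value; },
  get: (key) => settings[key],
};

keystone.set('wysiwyg storage', 'amazon'); // override the default provider
keystone.set('wysiwyg image upload', true); // enable image upload in the editor
keystone.set('wysiwyg image options', { // hypothetical resize-on-upload options
  width: 1200,
  height: 800,
});

console.log(keystone.get('wysiwyg storage')); // amazon
```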

Breaking changes

jamlen commented 10 years ago

One feature that I use from Cloudinary is uploading a PDF and having a thumbnail generated for it. This is really useful.

grabbou commented 10 years ago

Thanks @jamlen for sharing your idea! I agree that's pretty useful. Maybe we can create 'middlewares' for file uploaders: if the generateThumbnail option is set to true, we will try to generate an image thumbnail for the file and save it alongside the original.
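That middleware idea might look something like the sketch below. Both generateThumbnail and pdfToImage are hypothetical names from this discussion, not part of any released API:

```javascript
// After an upload, optionally generate a thumbnail and record it
// alongside the original file's metadata.
function afterUpload(file, options, done) {
  if (!options.generateThumbnail) return done(null, file);
  pdfToImage(file.path, (err, thumbPath) => {
    if (err) return done(err);
    file.thumbnail = thumbPath; // saved along with the original
    done(null, file);
  });
}

// Stand-in thumbnail generator so the sketch runs; a real one would
// rasterise the first page of the PDF.
function pdfToImage(path, cb) {
  cb(null, path.replace(/\.pdf$/, '_thumb.png'));
}

afterUpload({ path: 'sermon.pdf' }, { generateThumbnail: true }, (err, file) => {
  console.log(file.thumbnail); // sermon_thumb.png
});
```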

estilles commented 10 years ago

@grabbou and @JedWatson, does this mean we're dropping support for Cloudinary in 0.3.x?

grabbou commented 10 years ago

@JohnnyEstilles not AFAIK. I am only developing a new ImageField that will handle local & cloud images (upload/get/resize): a kind of free alternative if you already have hosting. In the future we will be able to add ImageFile.FTP as a provider as well. Cloudinary will remain untouched :)

As I have a deadline tomorrow for that feature, I will try to finish at least the Amazon integration. You can check out my fork (branch feature-imagefile) for progress. For now, I've copied the AmazonFile files; I'm quite unhappy that I had to redeclare jQuery and the templates as well. In the final version I will probably reuse the Amazon/S3 ones, as they are almost the same.

estilles commented 10 years ago

@grabbou thanks for the update. I'll check out your branch. BTW, I love the concept of Types.Image being an extension of a simpler Types.File. The idea is brilliant.

grabbou commented 10 years ago

@JohnnyEstilles thanks! Great to hear that!

yinear commented 10 years ago

We have a large number of pictures and don't want to pay Cloudinary for that. I wish local images could be added in the wysiwyg editor and managed as a Field. Thanks for your effort.

grabbou commented 10 years ago

Hi @Yinear. Although my current branch is a bit awkward, all those capabilities are now handled by Amazon (we had to migrate our CMS before today). I will try to finish the module today/tomorrow, and in the final version it's going to work exactly like that :)

JedWatson commented 10 years ago

Not sure if you saw #165, there were a few other issues where this has been discussed but you're basically nailing it, @grabbou. I need to run for a couple of hours but will write down a full summary of my thoughts on this tonight.

grabbou commented 10 years ago

Cool, thanks. I hadn't heard of them; it's good to close a few of them by accident! We can catch up later today. I will probably start working on it around 8-12pm. It seems we are going to have a busy time writing new modules; if you can mail me at the email address listed in my profile, that might speed up our work.

Important: topic updated. I removed my comments and moved everything into the first comment, so it should be clearer now.

JedWatson commented 10 years ago

Great spec, @grabbou. My ideas / questions:

Images

How are you planning to implement the cloudinary-like resizing features? I love the idea, but had planned to do "upload-time" generation of images from the source, based on a definition that would live with the field (e.g. multiple keys w/ sizes, crop methods, etc).

Emulating the on-the-fly image resizing from source (when missing) would definitely be cooler, but would also (I imagine) require either middleware to route the request for the image via a generator that integrates with the storage engine while serving the request, or a callback in the underscore methods, as checking for the size, and generating it if necessary, would be async (which the cloudinary methods are not, as all they do is generate a URL).

I've got more ideas on this, but first wanted to get a sense of how you're planning for it to work end-to-end.
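The "upload-time" variant Jed describes (sizes living with the field, each variant generated from the source at upload) might be declared like this sketch; every name here is illustrative, not a settled API:

```javascript
// Hypothetical per-field size definition: each key becomes a variant
// generated from the source image at upload time.
const sizes = {
  thumb: { width: 150, height: 150, crop: 'fill' },
  hero: { width: 1200, height: 400, crop: 'fit' },
};

// Derive the variant filenames that an upload-time generator would
// produce alongside the untouched original.
function variantNames(filename, sizes) {
  const dot = filename.lastIndexOf('.');
  const base = filename.slice(0, dot);
  const ext = filename.slice(dot);
  return Object.keys(sizes).map((key) => base + '_' + key + ext);
}

console.log(variantNames('photo.jpg', sizes)); // [ 'photo_thumb.jpg', 'photo_hero.jpg' ]
```

The trade-off against on-the-fly generation is discussed later in the thread: upload-time variants make the first page load fast, but changing a size definition means regenerating existing images.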

Cloudinary

I was originally going to leave the CloudinaryImage and CloudinaryImages field types in place, because I do expect our internal Image API to diverge from Cloudinary's. I'm happy to have one or two exceptions like this; I think "this other service processes your images" vs "keystone processes your image and stores it somewhere else (fs / s3 / etc)" are fundamentally different problems to solve.

Especially because CloudinaryImage/s fields are so widely used at the moment, I recommend we build this alongside them, and focus on getting our generic Image and File fields correct. (remember, we'll have to do both single and multi-file/image variants).

Promises

> Please note that async methods are wrapped in promises (Q library) to avoid callback chains and provide better functionality.

Generally, without taking either side of the async / promise preferences, I've kept all external Keystone APIs async and follow standard node conventions (error first callbacks, etc), using the async library as necessary to keep code clean.

I'd be happy to offer a dual-api (as Mongoose does, where you can fire off async functions by passing a callback argument, but they also return a promise) but don't want to split the API into promises / async.

My main reasoning for this is that it's quite easy to promisify async methods anyway, if authors prefer that API, and this keeps Keystone more consistent with core node / express APIs, so there are fewer things for newcomers to learn.
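The dual API being described (as Mongoose offers it) can be sketched like so; the function name and return shape are illustrative only:

```javascript
// Call with a node-style callback, or omit it and get a promise back.
function upload(path, callback) {
  const work = new Promise((resolve) => {
    // ...the actual storage work would happen here...
    resolve({ url: '/files/' + path });
  });
  if (typeof callback === 'function') {
    work.then((result) => callback(null, result), callback);
    return undefined; // callback style: nothing returned
  }
  return work; // no callback given: promise style
}

// both styles work against the same function:
upload('a.txt', (err, file) => console.log(file.url)); // /files/a.txt
upload('b.txt').then((file) => console.log(file.url)); // /files/b.txt
```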

I've found new users have trouble distinguishing the boundaries between what's Keystone, what's Mongoose and what's Express, which has reinforced the importance of a consistent API for me (something I'm a bit guilty of breaking with some early decisions, as concepts have evolved).

Dealing with Breaking Changes

Given the breaking changes in this, we'll need to publish it as a minor version bump (technically anything can break on 0.x.x versions in semver, but I like to go with "patch version bumps fix / add, minor version bumps break"). So ideally we can line up these changes with the move to React, which I was going to publish as 0.3.0 (since some people will be running custom builds / hacks that reworking the Admin UI may interfere with).

In the meantime, I'd like to keep master safe for updates to the 0.2.x versions, so I'm aggressively merging changes in master into my react branch and trying to limit refactoring in react until it's ready to merge.

Alternatively we could focus on adding new File and Image types, then Files and Images, merge them in the 0.2.x / master branch as we go, then remove the redundant field types in the 0.3.0 cleanup. Might be less disruptive that way, and the two tasks won't be dependent on each other.

I suspect the File and Image fields will be quite divergent (except for wrapping the underlying StorageClient layer) with File being much simpler... do you think it is worth splitting this into two tasks / issues, and tackling File first to get the basics down? Or do you want to tackle it all at once?

Finally on breaking changes, with the options that we're replacing (like s3 config), I'd like to deprecate these but have them pass the config through to the new system anyway (maybe with a warning to console.log). Much like @JohnnyEstilles with the List.addPattern deprecation in #523. Remember we'll want to handle compatibility with process.env keys.
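A sketch of that deprecation shim. The checkDeprecatedConfig name, the 's3 config' key, and the shape of the forwarded options are all assumptions for illustration, and keystone is stubbed so the example stands alone; a real shim would also honour the legacy process.env keys, as noted above:

```javascript
// Stand-in for the real keystone settings store
const settings = {};
const keystone = {
  set: (key, value) => { settings[key] = value; },
  get: (key) => settings[key],
};

// Forward a deprecated option to the new storage config, with a warning.
function checkDeprecatedConfig() {
  const legacyS3 = keystone.get('s3 config');
  if (legacyS3) {
    console.log("Warning: 's3 config' is deprecated; use keystone.set('storage') instead.");
    const storage = Object.assign({}, keystone.get('storage'));
    storage.amazon = legacyS3; // pass the old config through unchanged
    keystone.set('storage', storage);
  }
}

keystone.set('s3 config', { bucket: 'my-bucket' }); // old-style option
checkDeprecatedConfig();
console.log(keystone.get('storage')); // { amazon: { bucket: 'my-bucket' } }
```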

Refactoring the UpdateHandler

This has been on my mind for a while, I've opened a new issue for it specifically (#533) because it can / should be done independently of this. I'll jump into it soon, if you don't get to it first ;)

Finally, this is something I've been wanting to do to Keystone for quite a while, thanks for tackling it so comprehensively! :D

grabbou commented 10 years ago

Hi!

So, basically, in answer to your question:

Images

My initial idea was to provide users with underscored methods, let's say _resize(400, 500). The workflow would be (synchronous):

  1. In jade template, there is our _resize method, jade calls it.
  2. Our method checks using Storage API whether image exists on the filesystem.
  3. If yes - it returns that URL
  4. If not - it downloads the image (if necessary) / resizes it and uploads it again
  5. Returns new URL

I don't quite like point 4, and the fact that we will have to download the high-res photo, but the initial idea is to save originals in the cloud untouched, so we can then modify them without re-uploading. They might be prefixed with _original<name>.jpg or something.

Upload-time manipulation is better in that the first page load will be faster, but if we are not satisfied with the result, we will have to re-upload everything, which might be painful if we already have dozens of images uploaded.
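The five steps above, as a minimal in-memory sketch. In reality steps 2-4 are async, and every name here (makeResize, the storage methods) is hypothetical:

```javascript
// Build an underscored _resize method bound to a storage client and image.
function makeResize(storage, image) {
  return function _resize(width, height) {
    const key = image.name + '_' + width + 'x' + height + '.jpg';
    if (storage.exists(key)) return storage.urlFor(key); // steps 2-3
    // step 4: fetch the untouched original, resize, upload the result
    const resized = storage.resizeFromOriginal(image, width, height);
    storage.save(key, resized);
    return storage.urlFor(key); // step 5
  };
}

// Minimal in-memory stand-in for the Storage API
const files = {};
const storage = {
  exists: (key) => key in files,
  urlFor: (key) => '/uploads/' + key,
  resizeFromOriginal: (image, w, h) => image.name + ' @ ' + w + 'x' + h,
  save: (key, data) => { files[key] = data; },
};

const _resize = makeResize(storage, { name: 'photo' });
console.log(_resize(400, 500)); // /uploads/photo_400x500.jpg (generated)
console.log(_resize(400, 500)); // /uploads/photo_400x500.jpg (cached now)
```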

Right, I agree that we can leave Cloudinary untouched. This will give us more flexibility in implementing our own image manipulation methods without breaking current images.

Files

OK, that's not a problem. We can follow that pattern (I am a huge fan of the (err, callback) pattern as well). After giving it some thought, I think that's actually better.

Breaking changes

Well, personally, I'd rather do everything all together. Of course we will split it into smaller steps with very informal deadlines. The process will probably look like this:

  1. Storage
  2. StorageClient (Amazon as I got credentials)
  3. File
  4. Connecting Field & Storage (writing obtain() method)
  5. Testing if File works
  6. Image as extended File
  7. Wysiwyg tweaks
  8. Underscored image methods

I'm not sure they will be divergent; I am going to use File as a base, and as you can see from my spec, Image inherits almost everything from File :)

OK, so we are changing the keystone.set methods but keeping the process.env variables the same. We can then add a wrapper in Storage, checkDeprecatedConfig, that will run our set-up but with warnings.

Ideally, I would love to have at least limited repo access, so I can create a branch here called feature-storage and work on it continuously. After that we can merge with your React branch? I think that's easier and safer. If we are going to have breaking changes anyway, let's do that only once.

Cool, thanks for that issue. I will probably help you on refactoring that.

What about (for the future) moving a few fields and methods into Field (making them 'abstract'), as I mentioned before?

You're welcome! Thanks for this superb CMS. I fell in love with it from the very beginning, and it's fun that I can finally contribute to something I really like.

I am going to start on points 1 & 2 today, so the most important thing is to sort everything out before then :) We will probably run into a few issues and suggestions while writing it :)

webteckie commented 10 years ago

I'm having a hard time following the requirements set forth here that will be addressed by the storage API. Simple question: currently there is support for LocalFile, S3File, and AzureFile. Ideally I would like to be blind to the underlying storage system and just deal with a File. That way, when I'm testing things locally it uses my local file system, and when I deploy out to some cloud it uses whatever file system, transparently. Will that be possible with what's being defined here?

grabbou commented 10 years ago

Yes, @webteckie. Basically, StorageAPI works the way you'd like it to work.

It's all about configuration. You can use the default available provider by simply accessing one Storage method that gives you the current client instance.

If you want to use LocalSystem on localhost and e.g. S3File on your deployment server, simply configure the cloud credentials there and leave them empty on localhost. Further details will be announced soon.
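That per-environment setup might look like this sketch. The option names follow the draft spec and are assumptions, not the final API:

```javascript
// Pick a storage config based on the environment: cloud credentials in
// production, the local filesystem everywhere else.
function storageFor(env) {
  return env === 'production'
    ? { amazon: { key: process.env.S3_KEY, secret: process.env.S3_SECRET, bucket: process.env.S3_BUCKET } }
    : { local: { path: './uploads' } };
}

// keystone.set('storage', storageFor(process.env.NODE_ENV));
console.log(Object.keys(storageFor('development'))); // [ 'local' ]
console.log(Object.keys(storageFor('production'))); // [ 'amazon' ]
```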

webteckie commented 10 years ago

Excellent! And, will there also be an embedded wysiwyg property as part of it so that when you view the file in the Admin UI a preferred wysiwyg editor is used to view its content?

grabbou commented 10 years ago

@webteckie you mean different templates based on the extension of the file? (PDF/ZIP and so on)

UPDATE: I've started working on our API in a separate repository, https://github.com/grabbou/StorageAPI/. A few methods, the basic structure, and even some examples are there! We decided to release it as a stand-alone npm module, which means you will be able to integrate it into your other applications as well. After that we are going to include it within Keystone & rewrite the current fields. If you have any interesting thoughts, feel free to open an issue there.

It should be done within the next few days. As the Amazon integration is working now, it's time to write some test cases and move on to the other providers :)

webteckie commented 10 years ago

@grabbou actually, never mind. I'm thinking a bit ahead, assuming we have support for the ACE editor (for which I have opened #544). However, a wysiwyg property would be better integrated with the "code" field type, onto which I would then mount ACE. But please also keep in mind during your Storage API work that the storage endpoint can be a DB as well as the other file systems. For example, I may want to define a List that holds a JavaScript script, edit the script in the Admin UI, and store it either in the local file system and/or in the DB. I would envision something like this:

var JavaScript = new keystone.List('JavaScript', {
});

JavaScript.add({
    content: {
        type: Types.Code,
        lang: 'js',
        wysiwyg: { editor: 'ace', plugin: 'ace-js' },
        storage: { type: 'localfile' /* or 'db' */, ... }
    }
});

NOTE: But since you brought it up, do browsers automatically associate file extensions with applications? They would probably want you to download it and then open it. Can Keystone associate extensions with applications? Anyway, my concern was more the one I brought up in the previous paragraph.

grabbou commented 10 years ago

@webteckie, I have few interesting thoughts on that, let me get back to you shortly later today or tomorrow morning.

grabbou commented 10 years ago

@webteckie, if you can create an issue here (https://github.com/grabbou/StorageAPI) about adding support for MongoDB, that'd be cool. I will add that provider as well. You will simply need to use Storage.obtain('mongo') in your application to 'upload' a file to your collection. If you have enough time and feel comfortable with my codebase, feel free to write your own provider there (take a look at AmazonClient and the unit tests to get an idea of how it works).
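The obtain() resolver mentioned here might behave like this sketch. The provider registry is an in-memory stand-in, not the real StorageAPI:

```javascript
// Resolve a configured provider client by name, falling back to a default.
const providers = {
  local: { name: 'local' /* , upload, get, remove, ... */ },
  mongo: { name: 'mongo' },
};

const Storage = {
  defaultProvider: 'local',
  obtain(name) {
    const client = providers[name || this.defaultProvider];
    if (!client) throw new Error('Unknown storage provider: ' + name);
    return client;
  },
};

console.log(Storage.obtain('mongo').name); // mongo
console.log(Storage.obtain().name); // local (the default)
```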

In terms of allowing multiple storages per field, that's not related to StorageAPI itself but to the field code that we are going to refactor after finishing the API, so I've just added that point to my first comment here, where the brief description of structure and functionality is located.

On a side note, I've just finished writing tests and added multiple providers. I am going to add FTP/SFTP quite soon, and then we are ready to go.

webteckie commented 10 years ago

@grabbou done. Thanks!

grabbou commented 10 years ago

We are moving quickly! Almost all major providers are done. I am working on unit tests right now to bring our code coverage as close to 100% as possible. I hope we will start integrating this into Keystone within the next few days. Any feedback and code reviews are appreciated; feel free to contribute and comment on the currently open issues!

grabbou commented 10 years ago

Everything is done, the docs are updated, and almost all of the features are present. Only 2 issues remain before we start integration with KeystoneJS.

Repo -> https://github.com/keystonejs/storage.js

yinear commented 10 years ago

Excellent! Can this be used in the next Keystone release? When can we update our Keystone?

grabbou commented 10 years ago

OK, we can say that StorageAPI has reached its RC stage. Two providers, HP & Azure, are disabled for now as they do not support public file upload. We are working with the pkgcloud guys to add new features to allow CDN usage of those buckets. Instead, I've added a Dropbox integration.

Now I have to adjust the file types that I wrote recently. I hope a testable version will be ready by the end of the week.

yinear commented 10 years ago

That's great!~ Well done!~

grabbou commented 10 years ago

Here's the URL to my fork where I am working on it: https://github.com/grabbou/keystone/tree/feature-storage

There's currently not much to show, but I am about to finish integrating StorageAPI tomorrow.

On a side note, if anybody has enough free time to visit the StorageAPI repo and test it (maybe use it inside your project), that would be quite handy.

morenoh149 commented 10 years ago

@grabbou what needs testing? I cloned it.

grabbou commented 10 years ago

The providers, all of them. I am stuck doing other things but hoping to finish this week. There is a list of all the available providers in the Readme.md, so it'd be useful if you could go through some of them :)

danielmahon commented 9 years ago

@grabbou I've been working on a Transloadit File FieldType, which I pretty much have done, when I came across this storage API. I think it's a great idea, but I'd still like to be able to utilize it as a field type. How do you suggest I proceed so I'm not working against yours and Jed's plans? Thanks.

danielmahon commented 9 years ago

@grabbou The BIG thing that I also need to be able to support is a public facing API for db updates when the file/files are done processing.

sebmck commented 9 years ago

@grabbou Any update on this?

morenoh149 commented 9 years ago

@sebmck still pending testing. Will probably be addressed after express 4 and react updates are released.

sebmck commented 9 years ago

@morenoh149 Yep, for sure. Storage.js hasn't been updated since September and it'd be good to get a roadmap going so knowing the current status is pretty important.

grabbou commented 9 years ago

@sebmck @morenoh149 Yep, it hasn't been updated, mainly because I was sick and moved countries in the meantime. There were also a few issues with underlying libraries waiting to be merged into their master branches. For now I'll probably focus on more important features with the other team members, and then we will carry on storage.js development all together (doing something alone is pretty boring :( ).

sebmck commented 9 years ago

@grabbou Keystone has come pretty far in the last few months, we've since had our first community meeting #812, would be great to see you at the next one.

JedWatson commented 9 years ago

@grabbou nice to see you back, hope you're better! we're nearly ready for the major new release with React and Express 4 so we'll pick this up after that goes out :)

grabbou commented 9 years ago

@sebmck Yeah, I can see that. I am glad it keeps developing. I think I'll speak with one of contributors on Slack to get an overview what's going on at the moment. I am in for our next Hangouts meeting.

@JedWatson Great work on React! Sorry for being away. Let's start contributing once again! :)

okjake commented 9 years ago

Any further progress on this? Can I help with it?

JedWatson commented 9 years ago

@okjake I'm so glad you asked. I've been busily wrapping up the rebuild of the Admin UI, and this is the other major blocker for the new version; I'd really love someone to take it on and help with it.

Things have moved on quite a bit since @grabbou's work, and are encapsulated in the file-fields branch. It's quite out of date, so I'll have to bring it up to speed and figure out what's required to complete it. @creynders may be able to help organise the effort as well; he did quite a bit of work on it last. Let me know your email so I can add you to our Slack and we can chat.

okjake commented 9 years ago

OK great! Just added my email to my profile.

JedWatson commented 9 years ago

Great, you're invited!

mahnunchik commented 9 years ago

Hi @JedWatson

Any progress on this?

I need to implement something like this:

It is quite difficult to implement this on top of the current LocalFile implementation.

christroutner commented 8 years ago

@JedWatson I'd also love to help with this. @okjake If you can take a half hour to bring me up to speed with what you've accomplished, I can dedicate a couple hours a week to help with the integration of this. My email is available in my GitHub profile.

jeffreypriebe commented 8 years ago

@mahnunchik @christroutner Happy to help with this effort as well. A little more discussion over on the Google Group post. I currently have a branch that does some integration of cloundinary & s3 images/files into the main editor and want to see the file & image management in Keystone grow into a more robust system. (My work uses React and pre-integration Elemental and isn't setup as something to be folded into the main Keystone project ATM)

Let's get on the Google Group Post for continued discussion. cc: @okjake @JedWatson

rnjailamba commented 8 years ago

@jeffreypriebe @christroutner @JedWatson @grabbou How are things going with this feature? :)

rnjailamba commented 8 years ago

@jamlen
a code snippet showing how to use Cloudinary with PDFs would be VERY helpful!!!

jamlen commented 8 years ago

@rnjailamba from memory, I think the main bit that does it is presentation.src({ transformation: 'mottram-thumb' }), where I am referencing a predefined transformation; you could also just specify the transformation details inline.

I use this in my model:

Sermon.add({
    // ...
    presentation: { type: Types.CloudinaryImage, collapse: true, allowedTypes: ['application/pdf'], autoCleanup: true }
    // ...
});

Sermon.schema.virtual('thumbnail').get(function() {
    if (!this._.presentation.src()) {
        return {
            img: null,
            isPdf: false
        };
    }
    return {
        img: this._.presentation.src({ transformation: 'mottram-thumb'}),
        isPdf: this.presentation.format.toLowerCase() === 'pdf'
    };
});

and then this in the jade:

if sermon.presentation.exists
    if sermon.thumbnail.isPdf
        a(href=sermon.presentation.url).sermonImg
            img(src=sermon.thumbnail.img)
    else
        img(src=sermon.thumbnail.img)
else
    img(src='/images/nopdf.gif')

LuganOcode commented 8 years ago

Hi, I hope this is the right place to add this. I'm curious whether it is possible to set up the S3 storage so that the actual file uploads are performed not by Keystone, but directly by the client using a signed request:

The application I am building performs a lot of file uploads, and in the past I have used this signing strategy in other application frameworks with good success.

Below is a more concrete snippet of what this means.

In my experience this can quite significantly reduce load at no loss, especially since Keystone only stores the metadata anyway.

I'm sure I can tinker my way to this, but maybe there is already some kind of support for it that I am missing?

JedWatson commented 8 years ago

@LuganOcode this is something I'd like to look into for all the file-type fields once we've consolidated them, but that's looking like a post-0.4 thing. We need to get the foundations fixed before we build more features on top of them.

What will be available though is the ability to pass the metadata directly to Keystone so in the meantime (with 0.4) you could handle the upload to S3 directly with an extension to your app then submit the metadata to Keystone.

LuganOcode commented 8 years ago

@JedWatson Thanks for the tip. We are already using the 0.4 beta; the React UI stuff was just too nice to pass up. So one more hint then: if we go in the direction you suggest, would this imply essentially not using the S3File type and instead using a custom model object, or do you instead mean using the standard S3File type and processing submissions with the metadata but without the actual files?

Since this does seem to be the/an appropriate place, I'm going to point out one more small hiccup we ran into with file uploads and the UpdateHandler. In our application we ask users to create A/V files. This forced us to custom-configure the _uploads because:

  1. Client-side JavaScript does not permit us to add a file-type input to an HTML form programmatically; the user must do it. The only alternatives we could think of were to:
    1. Use a FormData object and submit with an XMLHttpRequest, which means we break context handling and flash messages, among other things.
    2. Append the file as a blob in a hidden input.
  2. The Keystone form-processing API does not know how to process a blob file passed via a hidden input, I think because it is not in the 'files' section of the request.

In the end we just moved this work into the view controller and replicated the work otherwise handled by the UpdateHandler (and I think Express is also maybe handling the temporary storage), but this is probably not the ultimately recommended way to handle such issues.

BTW this is a quite amazing thing!