We haven't tested this in a while, but it should work with a PUT to the same URL, minus the store name (http://localhost:3000/cfs/files/filesColl/zjAqp9ceEZPeezyWy). Obviously this requires that you first insert the file into the collection.
We probably should insert the file into the collection and then store the data when `params.id` isn't provided. Then you wouldn't necessarily have to do your own insert first.
The `expects binary data` message might be because we're checking to see if it's a Uint8Array, but it's probably a Node Buffer when done through HTTP. I'll see about a fix.
Yes, I can verify that if I do:
```
curl -X PUT -T image.png 'http://localhost:3000/cfs/files/filesColl/qjAqp9ceEZuc56XWy/'
```
Where that file already exists, I get back:
```
Error in method "/cfs/files/:value/:value/:value/", Error: Error: APUpload expects binary data
  at Object.APUpload (packages/cfs-access-point/accessPoint.js:5)
  at Object.<anonymous> (packages/cfs-access-point/accessPoint.js:239)
  at packages/http-methods/http.methods.server.api.js:417
```
> We probably should insert the file into the collection and then store the data when `params.id` isn't provided. Then you wouldn't necessarily have to do your own insert first.
How does any kind of authentication happen in this case? For GET/PUT/DELETE with a bare URL like this, it depends on the obscurity of the file id
(which you presumably had to authenticate to create/learn). Is there some way to pass the current user token in the HTTP header?
My use case for this question is a "worker" process on the server side (though perhaps not on the same machine) that is connecting via DDP to get "jobs" that require heavy background processing (similar to what Kue does). It's straightforward using DDP to authenticate a worker and have it use method calls and subscribes to receive and communicate status about tasks. What's missing is a way to pass large files back and forth between the worker and server. Simply supplying HTTP URLs to the input and output files for the worker to GET/PUT would be a great solution.
Your worker process could also insert the files and then chunk upload using DDP, which is what CFS does by default right now. That said, we're probably going to switch to HTTP for uploads because there are some performance issues with using DDP for binary data.
HTTP uploads are something we've designed it to support, so we just need to fix this issue you're getting. Still investigating.
OK, @vsivsi, I pushed some support for this. It still needs work, but it's at least working now.
To upload a new file:
```
$ curl -X PUT -T "My Test Picture.png" -H 'Content-Type: image/png' http://localhost:3000/cfs/files/filesColl/
```
You need to add the content-type header, but otherwise what you were trying before is correct. Don't include an ID.
If you do include an ID, it should theoretically overwrite that file, but I didn't test that and I'm guessing it's not working correctly yet.
Great! Thanks, I'll test it out in the morning.
A followup to my question above regarding authentication over HTTP: I notice that if I'm logged in, the `{{url}}` helper generates URLs like this:

`http://localhost:3000/cfs/files/filesColl/NvmYH63pTfijuccXa/filesGrid?token=5QRGg8pgk8rRmswfL`

with the `?token=XXX` parameter. Is this the mechanism for passing the token for all HTTP requests?
In testing this for HTTP GET, I'm noticing that it doesn't seem to matter whether I provide the token or not, even though I have `deny` code like this in effect:
```js
filesColl.deny({
  download: function (userId, file) {
    if (file.metadata.owner !== userId) {
      return true;
    } else {
      return false;
    }
  },
  // And so on for insert, update, remove
});
```
The `update`, `remove`, and `insert` deny rules seem to work correctly within the browser. But the `download` deny never seems to have any effect on the HTTP GET method. The documentation on this is a little thin, so I may be doing it wrong. Thoughts?
Too impatient to wait for morning... Your change does work for me for uploading a new file, thanks so much!
I can also verify that the allow/deny rules do not currently seem to be enforced on HTTP requests. The `curl -X PUT ...` example succeeds even when I have the following rule in place (that successfully prohibits inserts from the browser via DDP):

```js
filesColl.deny({ insert: function (userId, file) { return true; } });
```
I totally understand that this is under development, but it might be worth a prominent warning in the `devel` branch docs that the HTTP access to CollectionFS (which is enabled by default) is currently wide open.
One more bit of information: the allow/deny functions don't even get called for HTTP requests.
I thought auth was working for me. Do you have the `insecure` package removed? I'll test again later today.
I just verified that the allow/deny functions are correctly called for GET and PUT, and when I have deny return `true`, I get a Forbidden/Access Denied response. I didn't double-check to see if the userId is correctly set, but if you pass the `token=` query string param, it should be.
@aldeed I'm thinking about a way to sign/auth HTTP via the DDP connection - maybe have a way of signing/encrypting the user data. The current http-methods implementation does a db call to look up the user to validate - this info could be extracted from the encrypted user data, avoiding db access. (This pattern could result in extracting the fileworker + HTTP access point and storage adapters into node.js packages, making it a more separated structure - just saying.)
@aldeed, I am running with the `insecure` package. Although that doesn't really explain what I'm seeing, because according to the Meteor docs, `insecure` should have no effect if any allow/deny rules at all are declared on a collection:

> Meteor also has a special "insecure mode" for quickly prototyping new applications. In insecure mode, if you haven't set up any allow or deny rules on a collection, then all users have full write access to the collection. This is the only effect of insecure mode. If you call allow or deny at all on a collection, even Posts.allow({}), then access is checked just like normal on that collection.
I was setting `deny` rules and they were being respected on my CollectionFS when accessed via DDP, so this behavior of "ignore access rules for HTTP requests when `insecure` is installed" is unexpected and seemingly incorrect.
tl;dr It shouldn't matter that I am running insecure because I defined allow/deny rules on this collection.
That's probably our fault for implementing it incorrectly then. Also, I realized that there might actually be an issue because I was testing with a local codebase that was missing some recent changes. I'm looking into it.
Okay, I've removed the insecure package and tried again, and now I'm seeing a different set of issues related to the `userId` values being provided to the allow/deny rules.

Here's the setup: I've defined a set of `allow` rules for each of `insert`, `remove`, `update`, and `download`. When logged in to the client, if I set all of the `allow` rules to return `true`, everything works great.

But if I add a simple ownership test à la `file.metadata.owner === userId`, then all hell breaks loose.
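Roughly, the setup looks like this (a sketch rather than my exact code; `filesColl` and `metadata.owner` are as in the `deny` example earlier):

```js
filesColl.allow({
  insert:   function (userId, file) { return file.metadata && file.metadata.owner === userId; },
  update:   function (userId, file) { return file.metadata && file.metadata.owner === userId; },
  remove:   function (userId, file) { return file.metadata && file.metadata.owner === userId; },
  download: function (userId, file) { return file.metadata && file.metadata.owner === userId; }
});
```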
Here's what I see happening in this case: correct `userId` values are passed to my `allow` functions for `insert` and `remove`, but `userId === null` for `update` and `userId === undefined` for `download`.

The impact of this is that when I upload a file, the initial CollectionFS document is created (the `insert` succeeds), but the subsequent `update` to add the `chunks` and `copies` attributes fails, and the upload fails on the client with an exception `Error: "Queue" Access denied [403] [403]`.
If I relax the `update` `allow` rule by setting it to always return `true`, then uploads succeed. However, downloads via HTTP still fail because the `download` rule isn't being provided with the valid `userId`.

Removing the file via DDP works correctly, with the correct `userId` value provided.
I think the issue you're seeing now is the same as #166. Something about our code that is "logging in" on secondary DDP connections isn't working anymore. Is this on Meteor 0.7.0.1 or the new RC release?
To add to my above comment: the `undefined` `userId` being passed to the `download` rule is coming from an HTTP GET with the `token=XXX` query parameter correctly set.
I'm running this on the latest commit of the `shark` branch, which has a bunch of stuff post v0.7.0.1 merged into it. Specifically, I'm running https://github.com/meteor/meteor/commit/1d24565fb5ed6a5571cd6c037f1a60a092be8eaa
Agreed that https://github.com/CollectionFS/Meteor-CollectionFS/issues/166 probably explains the `update` issue I reported above. However, I think the HTTP GET `download` issue is probably distinct, since it is getting a different value (`undefined` vs. `null`) and that case doesn't involve DDP.
I just pushed some changes (includes slightly different URL syntax, too, so beware). In my testing, userId is coming through fine with HTTP GET and `token`. If it's still not working for you after taking the latest, then there must be something on `shark` that's different.
@aldeed, May I make one more (hopefully) relatively simple request related to the HTTP PUT functionality you got working for me last night?
As of right now, the PUT works, in that it creates/uploads a file. But what is missing (for my use case at least) is some mechanism for the worker to know which uploaded file is the result of its work (I'd prefer not to rely on the uniqueness of filenames). Right now, once that PUT happens, the worker/client is in the dark about where the file went or even whether the upload really worked.
So what I need is either:

1. for the PUT response to return the `_id` of the newly created file (the reply is currently empty), or
2. the ability to PUT to a specific, pre-inserted `_id`.

Arguably both of those should work at some point, but I'd be happy with one or the other for now.
Also, the #166 issue is fixed, or more accurately, "worked around".
Re PUT, I have the HTTP method returning `{_id: file._id}`, but it doesn't seem to be coming through. There's probably some different way to do it that would work. Maybe I have to stringify it? I'll have to investigate, or @raix probably knows.
I just tried everything again using the latest versions of all CollectionFS components, and there are some improvements. The `update` allow rule now gets the correct `userId`, and the upload succeeds. Yay!

I do see an error in the client console, `Exception while simulating the effect of invoking '/cfs/files/put'`, and the error object contains `Access denied [403]`. This only occurs in conjunction with an upload.

Once the file is uploaded, however, I'm still seeing 403 errors on HTTP GET requests caused by `userId === undefined` in the allow `download` rule function. The URL I'm using (provided by `{{url}}`) is now:

`http://localhost:3000/cfs/files/filesColl/filesGrid/universe.png?token=XXXXXX`
OK, I "desharkified" my test app and tried this out on both Meteor v0.7.0.1 and the new v0.7.1.0 RC2.
It works as you describe on v0.7.0.1 and fails identically as it did on `shark` with the RC2 build.

So whatever changes cause it to fail on `shark` are about to be released when v0.7.1.0 drops.
I've found and fixed the HTTP authentication problem in my local repo. The issue is due to changes in Meteor involving the hashing of active tokens in the user db. A backward compatible fix (works for both 0.7.0.1 and 0.7.1.0/`shark`) involves a single-line change in Meteor-http-methods / http.methods.server.api.js.
Use this line instead:
```js
var user = Meteor.users.findOne({ $or: [{'services.resume.loginTokens.token': userToken},
                                        {'services.resume.loginTokens.hashedToken': Accounts._hashLoginToken(userToken)}]});
```
Ha, I just made that same change before I saw your latest comment. Wish I had seen it first as it would have saved me some hunting. :) Anyway, the http-methods package is fixed now. Thanks!
@raix, FYI.
@aldeed,
tl;dr: I think there is a compelling case to revert to using `_id` in URLs instead of `key`, and to strongly consider using `_id` as the storage adapter `key` as well...
Okay, I've been testing with all this and I've run into an issue regarding the URL format change you recently introduced (now using the file's `key` instead of the `id` in the URL).

The issue arises because filenames in a CollectionFS collection are not guaranteed to be unique; in essence they are just another bit of metadata. In my testing, a CollectionFS is happy to accept any number of identically named files (which is the correct design IMO). The way this is managed internally (given that both GridFS and the OS filesystem require uniquely named files) is that CollectionFS maintains distinct entries for `name` (non-unique metadata) and `key` (must be unique, used as the GridFS and/or filesystem `filename`).
The current implementation attempts to use `name` for `key` and resolves file key collisions with a simple scheme that adds an incrementing number to the base name of the file (`foo.bar`, `foo1.bar`, `foo2.bar`, ...). And herein lies my issue:
If I use HTTP PUT to insert a file, I know the name of the file I added, and in theory I can construct the URL to GET it back. In practice, using this approach, I'm not at all guaranteed to get the same file back, because if I insert `foo.bar` and a file with that key already exists, then my new file's key will be `foo1.bar`, unless that already exists... And this will be even more problematic in the face of a working implementation of HTTP DELETE (or PUT that can replace an existing file).
Having thought about this for a day or so, I'm pretty convinced that the only sane way to manage this is to return to using the guaranteed unique and unchanging CollectionFS file `_id` for all HTTP URL operations. In addition, it also seems more sane to just use that `_id` as the storage adapter key as well (retaining the extension if desired), and dispense with the "test and renumber" logic that adds complexity but little if any value.
@vsivsi, I agree with you mostly, but there is at least one gotcha you're not considering, which is that we intend for storage adapters to be able to optionally sync back to the FS.Collection and any other defined stores. That's where some complexity arises because we can't simply assume a file key/name based on ID will be unique. Nevertheless, I think you're on the right track.
These are some requirements as I see them:

- The GET URL needs to uniquely identify a file in a particular store.
- The GET URL should be constructable even if we haven't yet stored the uploaded file anywhere.
- A storage adapter's `put` job is to receive a file, store it, and return a "key" that uniquely identifies that file within the context of that store. Currently we pass in a "recommended key" that is used as the basis, but we of course must alter it to be unique. The only reason we currently pass the filename as the "recommended key" is because that makes it fairly easy to identify files if you're looking at them directly on the SA drive. We could in theory pass anything or nothing as the "recommended key".

So then, why shouldn't we use the store key in the GET URL? Because this fails one of the requirements: it "should be constructable even if we haven't yet stored the uploaded file anywhere". Instead, as you suggest, we can revert to using the ID and store name to construct the GET URL.
What about adding the file extension for readability? Well, I don't think we can reliably do that, because we don't know the final file extension until the store finishes saving the file (it could be changed by the `beforeSave` function). We could use the extension of the uploaded file at first, but that could lead to confusion if it changes, and would mess with caching. Thus I'm afraid we won't be able to have an extension on the GET URL.
So I propose these URLs, probably similar to what they were before:
PUT (insert) sends back the new `_id`.
Regarding the "recommended key" and the final key: My current thinking is that we should pass the entire FS.File
instance to the SA put
method and let each SA do whatever it wants to construct a unique file key. In general, they would probably do file._id + "-" + file.name
and then alter that to be unique if necessary before saving it.
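For instance, something along these lines (just a sketch of the idea; `keyExists` stands in for whatever existence check a particular adapter has available):

```js
// Hypothetical helper an SA could use: derive a key from the FS.File instance,
// then alter it until it is unique within this store. `keyExists` is whatever
// existence check the adapter has (fs.existsSync, a GridFS lookup, S3 GetObject, ...).
function makeUniqueKey(fsFile, keyExists) {
  var base = fsFile._id + '-' + fsFile.name;
  var key = base;
  var counter = 0;
  while (keyExists(key)) {
    counter += 1;
    key = base + '(' + counter + ')';
  }
  return key;
}
```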
@raix, let me know if you disagree with any of this. This discussion seems to indicate that your idea about using the store key as the URL won't work. If I'm misunderstanding your idea, let me know.
@aldeed, Thanks for the detailed response. I'll probably have more questions/comments about the above after some time to think and peer into the code a bit more, but right off the bat one of your stated requirements puzzles me:
> The GET URL needs to uniquely identify a file _in a particular store_.
Why is it necessary for the HTTP client to even be aware that there are stores? There are two related reasons I think this may be undesirable.
1) Philosophical: Leaky abstraction. You've all gone to great lengths to build up a consistent "Collection" abstraction here, only to allow a bit of "how it's stored" detail to leak into the picture for the client.
2) Practical: If I PUT a file to `baseUrl + /:collectionName/:newfilename` and then want to use/communicate that file's GETable URL, how do I know what URL to construct? Clearly the PUT needs to return the `id`, but in the proposed scheme, it also needs to return a valid `store`, which seems to violate:
> The GET URL should be constructable even if we haven't yet stored the uploaded file anywhere.
Beyond the philosophical objection above, it seems to me that the `store` need not be unique and/or static. This raises the prospect of GET URLs that become obsolete simply because some internal storage detail changed. It also implies that the client should have some say in precisely how a GET request is satisfied by the server (IMO, a detail better left to the parts of the system best informed to make that decision).

In short, it seems that the `store` isn't strictly necessary to satisfy a GET request, and requiring it leads to a bunch of potential problems.
@vsivsi: **Client knowledge of stores** The client needs to know what store you want to download from - you could have a store that contains "thumbnails" or "original" etc. We did use the naming "copies" prior, but it's really a concept of having multiple stores attached - that said, one could reuse storage adapters (SAs), e.g. have multiple stores use the same SA.

**Transfer http only** The current client needs to know what store to use for the initial upload - we want to allow the client to upload directly to S3 and then have it trigger a sync of the multiple stores. This pattern could change - I'm currently preparing to replace the current DDP upload with HTTP upload. The reason for this is:

In reality we could simply have HTTP as the default upload/transfer - the server would know how to handle an upload to an FS.Collection. It would also make it a bit easier to write/use libraries from iOS/Android etc., since HTTP is widely adopted. So, that said, the client only needs to know what server and what collection to target.

We will probably be implementing the S3 way of signing - it works well with Meteor. I've already created a working implementation - not released (to anyone other than @aldeed).
**Naming key/urls pattern** Regarding the naming convention - we are working on a pattern. It's going to be understandable; you could simplify it, but it would come at a cost. As @aldeed correctly points out, one of the patterns that I recently suggested has an implication for the extension. We are trying to abstract and isolate the code modules/packages into small black boxes. It will make it easier to extend and debug. Having 3 SAs already is really nice when debugging - it's a great way to isolate problems and try to solve them.

The url pattern and key need to consist of the following:

We also have the filename as the last part - this is actually causing problems for our architecture (trying to separate code into reusable packages). Now, why bother with it? One could say that from the user's perspective it's human readable - that's important too, right? Especially when/if the user wants to download the file locally.

**Future ideas** @aldeed What I've been thinking about lately is making the use of the filename optional - the SA will not have anything to say about naming conventions - it will be told the key - how it handles the actual storage is SA specific. Now, the `beforeSave` on the other hand will have something to say when it comes to the extension - but that's about it - we can grab this on the fly?
```
collection/id                // this is the minimum for downloading a file
collection/id/store          // we select a specific store
collection/id/store/name.ext // this is the full key when the fileWorker adds a file to urls / stores
collection/id/name.ext       // this is invalid - will never be used since urls contains the store too
```

The keys are actually the `urls` - they will have to be prefixed by the default `/fs/files` or a custom user-defined mount point.
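For illustration (the collection/id/store/name values here are made up), assembling a full download URL from the mount point and such a key would look like:

```js
// Illustrative only - the mount point and key layout follow the pattern above,
// but the collection/id/store/name values are made up.
var mountPoint = '/fs/files';                      // default (or user-defined) prefix
var key = 'images/abcd1234/thumbnails/bunny.png';  // collection/id/store/name.ext
var downloadUrl = mountPoint + '/' + key;
// -> "/fs/files/images/abcd1234/thumbnails/bunny.png"
```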
Hi, great work so far on CollectionFS. I agree that the GET URL needs to uniquely identify a file in a particular store. It does make sense to implement the naming pattern proposed by @raix. I just started using CollectionFS, and in particular cfs-s3, hence I may not understand this fully. I do have some points (my 5 cents worth).
We solved these things in the following pattern:

1. the file is removed or replaced physically in the store itself
2. the store is removed from the `urls` object (these are the keys that link the files)

Both will result in a missing store in `urls`, and the fileworker should recover. 1. is the mechanism also used when synchronizing - the SA will simply overwrite the file physically.
I understand that creating a thumbnail is important, but we are also going to provide a scheme making the image available in `urls` even before it's uploaded - this is our latency compensation - again, working in our private prototype.
@raix:
> The SA will not have anything to say about naming conventions - It will be told the key - how it handles the actual storage is SA specific.
This is something I've tried, but it's not practical for at least three reasons:

- It required keeping a separate `.files` collection to do this mapping, but I removed it.
- It affects `sync`, too.
- Thinking of the packages as black boxes, the SA would not know what syntax is expected when passing a key to the collection for a file stored directly into the store and sync'd back.

I'm really thinking that the most "black box"-ish way to do it is to pass the `FS.File` instance to the SA and let it use whatever info it wants to construct and return whatever type of key works best. Specifically:

- The filesystem SA could use `file._id + "-" + file.name` and potentially append numbers to make it unique.
- The GridFS SA could insert into the `.files` and `.chunks` collections and then return the `.files` `_id` as the unique key.
- The S3 SA could use `file._id + '/' + UUID + file.name` as the S3 key, doing a GetObject first to make sure that key doesn't exist yet.

The only downside of this approach is that we cannot possibly have an accurate file extension on the URL. Or to do so, we would need to have the developer define a `savedExtension: 'png'` option on each store that has a `beforeSave`.
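If we went that route, the configuration might look something like this (purely hypothetical - `savedExtension` doesn't exist today, and the options argument shown here is an assumption):

```js
// Hypothetical: declaring the post-beforeSave extension up front so URLs could
// carry an accurate extension before the file is actually stored. Neither the
// savedExtension option nor this options argument exists today.
var images = new FS.Collection('images', {
  store: new FS.Store.FileSystem('thumbnails', '/myfiles', {
    savedExtension: 'png',
    beforeSave: function () { /* resize / convert to PNG here */ }
  })
});
```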
@vsivsi, regarding the client knowing about stores, @raix explained it pretty well, but to put it another way: After uploading a file, it no longer exists as a single file. Instead we store copies of it in one or more places (stores) and we potentially manipulate/alter those copies. The original file may not even be stored. For images, you might store thumbnail, small, med, large copies and then toss the original in case it was giant. For audio, you might store a 30 sec mp3, full length mp3, and full length wav, regardless of what the exact uploaded content type was. The point is that knowing the file info for the original uploaded file tells us nothing (necessarily) about what files are available for download.
@aldeed true, but:
- the SA is simply told a `filename` that equals 'collectionName/id/store/filename.ext', etc.

The `beforeSave` would know about the extension due to the API, e.g.:

```js
this.gm().resize(50, 50).save('jpg');
// Or
this.setExtension('wav');
```
I think having SAs as "dumb" slaves would simplify the logic - they need a filename and some data. True, we could send a file object, but the SA doesn't make use of the logic. The fileworker should carry out the execution of `beforeSave`.
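As a sketch of that "dumb slave" idea (the constructor and method names here are made up for illustration, not the actual CFS adapter API):

```js
// Hypothetical "dumb" storage adapter: it is handed a key and some data and
// does nothing clever about naming. Names and signatures are illustrative only.
function DumbMemoryStore(name) {
  var files = {}; // in-memory map of key -> data, for illustration

  this.name = name;

  this.put = function (key, buffer, callback) {
    files[key] = buffer;   // store exactly under the key it was told
    callback(null, key);
  };

  this.get = function (key, callback) {
    callback(null, files[key] || null);
  };
}
```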
Regarding the sync - it's something the fileworker should solve. I think if we could have it use the `urls` and have it be intelligent about it, we could have a simple and powerful pattern for triggering sync after upload, and a nice way of rerunning or synchronizing in general.
@raix, I'm hoping you're going to convince me soon, but so far not yet. :)
The `beforeSave` knows about extensions, also if changed/converted - so it would save with the correct extension and content.
The issue is not about saving with the correct extension. It's about the key being correct. If I'm understanding you correctly, here's an example of the problem:

1. A user uploads `bunny.png`. Knowing the collection, id, and store name, we pre-define the file's unique key as "/images/abcd1234/thumbnails/bunny.png".
2. The store has a `beforeSave` function that shrinks and converts to JPEG. The result is a file with name "bunny.jpg". So we now save as "/images/abcd1234/thumbnails/bunny.jpg", and the pre-defined key no longer matches what was actually stored.

And regarding sync, I don't see how the fileworker can solve the issue of name conflicts. Here's an example of that:

1. A file `bunny.png` is uploaded; this store has no `beforeSave`, so we just save it into the filesystem SA folder. The key CFS creates using the convention is "/images/abcd1234/thumbnails/bunny.png" and we save it in "~/uploads/images/abcd1234/thumbnails/bunny.png". But oops, there was already a file there that was saved externally, so we've just overwritten it.

The only way around this issue is to search the entire collection and look at the existing key for this store for every file and then alter our key if it isn't unique. That is not efficient, so it's much better to let each SA worry about ensuring uniqueness at the time of saving.
hehe, gotta do a better job then :)
- the fileworker does `$set: {'urls.mystore': 'collectionName/id/storeName/bunny.png'}`
- the client reads `file.urls.mystore` and gets 'collectionName/id/storeName/bunny.png' - we prefix with the mount point '/fs/files' or a custom name
- key conflicts: we allow rewrites, so there will be no odd cases?
OK, so you suggest we move the calling of `beforeSave` into the fileworker, which is fine, and that solves one issue.

This plan still doesn't allow us to know the URL before we save, as in my previously stated goal:

> The GET URL should be constructable even if we haven't yet stored the uploaded file anywhere.
So I guess the question is: Is that really an important feature? The reason for wanting to be able to construct the URL before it's saved is to be able to do something like this:
```js
MyImages.insert(file, function (err, fsFile) {
  console.log("Thumbnail file available at", fsFile.url({store: 'thumbnail'}));
});
```
If we can't know the URL until it's saved, then this would usually (race condition) print "Thumbnail file available at null". But if URLs are constructable without first saving (i.e., if they don't include an extension), then the `url` method will never return `null`, so we can immediately know what the URL will be.
True, the url will not be present in `urls` before the fileworker has stored the file to that SA - in the meantime, what we could allow is to return the dataUrl if no store is found. The cloudFS does this - displaying the image when the upload has begun - it's handled reactively, so when `urls` contains the real image, that will be loaded instead and the memory released (via the upload queue). (This is only relevant for images and files up to a certain maximum size.)

We could also, if viewing remotely, display a spinner or a progress indicator while the upload is progressing.

Btw, the url is constructable, since the filename is just there for show - we could query for the real key: 'collectionName/id/store'. But it does not matter, since the file cannot be retrieved until the SA has completed.
> The url is constructable since the filename is just there for show
If we use a dummy filename or no filename at first, that affects caching.
Maybe we just give up on the "always constructable" requirement. One will have to use deps or observes to wait until `hasCopy()` returns true and then call `url()` at that time.
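For example, something along these lines (a sketch; `Images` and `fileId` are placeholders, and I'm assuming `hasCopy` accepts a store name):

```js
// Wait reactively until the thumbnail copy exists, then grab its URL.
Deps.autorun(function () {
  var fsFile = Images.findOne(fileId);          // reactive lookup of the FS.File
  if (fsFile && fsFile.hasCopy('thumbnail')) {  // assumed signature
    console.log('Thumbnail available at', fsFile.url({store: 'thumbnail'}));
  }
});
```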
Well - it's constructable - but we can only be sure that we can load the file when the url is in `urls`. It's true about caching - except that URLs without a filename could be resolved and redirected on the server (kind of like the signed url scheme) - we should be consistent about urls.

But what do we expect to see when uploading a file? Well, I like the idea of showing the image from local while uploading and then switching when ready.

If the file is not found, e.g. while a remote user is uploading, would it be ok to show a loading gif indicator? (This could be done by a temporary redirect on the server.)
Oh my, what have I done here... :-) (jk, this discussion has obviously been ripe to happen for a while.)
It is extremely tempting for me to jump into this conversation with a bunch of concrete proposals of my own based on hard earned experience. I have a long history of working on precisely the types of use cases you are hoping to address. But this is your project and you've clearly put much more thought into what you'd like to build here than I possibly can. Also, this is not what I do anymore, and I need to stay focused.
So in lieu of making a detailed set of counter-proposals or requirements, please indulge me in considering a few general points:
> CollectionFS: Meteor webbased filesystem handling up and downloads.

IMO, to really succeed, CollectionFS needs to do this critical and non-trivial thing well. And thus far the core CollectionFS and FileFS constructs go a long way toward achieving that. Congratulations.
However, it has become clear in reading the dozen or so comments above this one that the vision for CollectionFS is expanding considerably to encompass more of what I think of as "Content Management System" (CMS) functionality. In particular, the planning process seems dominated by use cases involving photo sharing (e.g. automatically generating thumbnails and other alternately resized images, etc.). This is an important use case -- perhaps by volume of files currently the most important -- but it is only the tiniest slice of the possible interesting uses of what you are building here. So my general concern is that you may succeed in building a very good piece of photo (and perhaps audio/video) sharing infrastructure, at the cost of building the best possible general file store for others to use in ways you cannot yet imagine. I understand that this may not be your goal, but I argue that even for your own sake it probably should be.
And so my advice is to focus feverishly on creating a simple, efficient and general purpose file store; and then let a thousand flowers bloom, cleanly abstracted on top of that. One class of such things will certainly be CMS packages, and you yourselves may write the very best one, but work hard to make that a separate thing built on top of the general abstractions of CollectionFS. When thinking about the users of the core CollectionFS, try to imagine all of the different applications involved in this list. Your photo sharing use case touches a tiny fraction of what is out there to be done, even if it seems like the most relevant and pressing immediate concern.
So a few concrete observations/suggestions for thought:
- From the discussion above, it sounds as though a given `_id` might be associated with different files simultaneously. If true, that will lead to insanity. In a true filesystem the thing with an `id` must be a file: a single, specific blob of bits that may exist in multiple places (some of which may be stale/invalid), but which has only one valid accepted retrievable state at any given instant in time.
- Decide what the atomic unit of the system is and associate `id`s with that. If you choose what I will call an "asset" or piece of content (e.g. all of the current resized/format-converted versions of an image), then you are not building a file system, you are building a CMS. Analogously, if you add the requirement that all current and prior versions of a given file must be retained, you would be building a revision control system, and not a file system, and your `id`'ed atomic unit (in a system like git) is what we know of as a "commit".
- Metadata is the natural home for everything else (e.g. `owner`). My advice is to put everything you possibly can in there and strip the functionality of your core CollectionFS/FileFS constructs to the bare minimum. IMO, there should be nothing specifically built into CollectionFS for use cases such as connecting an uploaded image to its thumbnail. That can be easily accomplished by a CMS layer built on top of CollectionFS that knows how to correctly respond to:
  - http://localhost:3000/cms/images/universe.png
  - http://localhost:3000/cms/images/universe.png?thumb=true
  - http://localhost:3000/cms/images/universe.png?original=true
  - http://localhost:3000/cms/images/universe.png?resize=1200x800
  - ...
I've written more than enough, and I hope I don't seem preachy or come across like you don't already know much of this, because you are clearly smart, skilled and motivated enough to get this right. But I saw enough red flags in the above (now much too long) thread that I felt duty bound to relay some of the dangers I see as an experienced outsider. I sincerely hope you find it constructive and useful.
@vsivsi, generally speaking, I think we're on the same page. We're certainly not planning any versioning (at least I'm not), and the "CMS" aspect is limited to the fact that we're allowing multiple "copies" of a file to be created and stored. (The terms "copy" and "store" are somewhat merged now. Multiple stores can be used to save the same copy in multiple places, different copies in the same place, or some combination.)
The goal is for it to work without much configuration out of the box but support additional things you might want to do.
In the simplest case where you have just one store/copy, a client will be able to request the file back without knowing anything about the stores (i.e., I store a file, I'm told its new ID, I use that ID to retrieve the file back). If a client does happen to know that other copies of the file have been stored, it can request a particular copy.
Currently the atomic unit is the uploaded file. The id corresponds to this concept. If this file has actually been saved in various formats in various stores, then that information is found in the `copies` metadata of the uploaded file's record. You say that this means we are building a CMS, but I can't really envision a good file storage package that does not allow you to manipulate and potentially duplicate a file before storing it. Once you allow that, there has to be some way to tie the various copies together. If you have concrete suggestions in this area, I'd be happy to hear them.
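To make that concrete, I picture the record looking roughly like this (purely illustrative shape, not the exact CFS schema):

```js
// Illustrative file record: one uploaded file (one _id) with the stored copies
// tied together under `copies`. Field names and values are examples only.
var exampleFileRecord = {
  _id: 'abcd1234',
  name: 'bunny.png',
  metadata: { owner: 'someUserId' },
  copies: {
    thumbnails: { key: 'abcd1234-bunny.jpg', type: 'image/jpeg' },
    originals:  { key: 'abcd1234-bunny.png', type: 'image/png' }
  }
};
```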
As for sync, that probably won't be implemented immediately, but we're trying to make sure we don't do anything architecturally that would preclude it.
@vsivsi no worries :) we have been discussing a lot of stuff the past 6 months, in and out of issues - it seems to be a good way of bending each other's minds, and we do try to find the best pattern or maybe a whole new way of considering files. It's a balance, because our main goal is: we want this to be modular/reusable/flexible, fast, light and easy to use.
Multiple filehandlers/copies or stores per file is one of the parts that we often need when speaking of apps - we have a lot of focus on images since it's a very common usage, but we also allow stuff like conversion from one sound/video format to another, or text to speech, etc. It's very flexible.
That said, you can actually leave out the cfs-fileworker package - the `beforeSave` and multiple-stores support should go away (when we are done refactoring). You would have the simple use:

```js
var images = new FS.Collection('images', {
  store: new FS.Store.FileSystem('files', '/myfiles')
});

url = images.findOne().url(); // returns the first store if nothing is specified
```

With the flexibility to easily select your storage adapter.
Sync is not going to be a feature in the first release, but as @aldeed said, we should have it in the architecture. Sync is nowadays a fs feature; we could have a Dropbox or Google Drive storage adapter - even the filesystem SA should be able to start a sync. In the current architecture sync isn't that hard - but it requires that we keep track of a universal timestamp for conflict resolution and have a way for the SA to trigger a sync operation. That's a package to come :)
I guess FS.File and FS.Collection are the core - everything else is replaceable.
@aldeed I tend to agree with you on the point where we throw a fileObj at an SA and have the SA figure out the reference. Ideas:

- a `cfs-upload-manager` that takes chunks of data and, when done, contacts the relevant SA.

@vsivsi The client need not worry about stores in the constructor in the future - I think we are going back to the old pattern here. It's mainly due to the recent shift of the upload pattern to HTTP.

```js
// client
var images = new FS.Collection('images');
```
We plan to release before April - mainly need to finish refactoring and write tests.
@raix, re your bullet list, you must not have looked at the new issues I created yet, but you'll be happy to know that they happen to be calling for exactly the same changes as you suggest in your list. :)
Cool - two heads one mind :) I'll take a closer look
I believe other issues have been opened for anything unresolved in this one.
I hope this can help :)

```js
PostImages.allow({
  insert: function () { return true; },
  update: function () { return true; },
  remove: function () { return false; },
  download: function () { return true; }
});
```

I had that problem too, and I just allow downloads with `download: function () { return true }`.
First off, kudos to the devs for the new version of CollectionFS, it's looking great!
In the docs I see references to the HTTP access point, and it is easy to verify that this is working (for download via GET) using the `{{url}}` handlebars helper. So I see something like `http://localhost:3000/cfs/files/filesColl/zjAqp9ceEZPeezyWy/filesGrid`, which gives the API pattern for an existing file.

Question: How do I use the HTTP access point to upload a new file, or remove/replace an existing one? What I'm looking for is the equivalent to the CouchDB file attachment API here. I see lots of comments suggesting that this functionality already exists, but my initial attempts to ferret out how it works have failed.
I'd be happy with something as simple as an example `curl -X PUT -T 'filename'` based file upload command. All of my attempts thus far have resulted in a hail of `Not Found [404]` and `Error: APUpload expects binary data` messages.

Any help that prevents me from having to reverse engineer how this works would be much appreciated!