Kind of like having a NAS server doing a torrent?
I think it would make sense to have that feature - I think you could create a torrent file on the client side and then have a filehandler parse it and fetch + create the files.
But it would be nice to have a storeRemoteFile for this - creating the file object with built-in functionality for it. After the file is loaded into the db it should be treated as a normal upload, with custom filehandlers running to create cached/transformed versions of the file etc.
It's a nice idea - any thoughts on the API would be great. I might have a look at it some time this week. I'm thinking something like:
var newFileId = ContactsFS.storeRemoteFile( url );
Or... do you mean the serverside implementation of ContactsFS.storeFile? That would be implemented when doing the above - since it's almost the same. It should take a Buffer or a URL as a parameter and store that in the db.
Huh, I do not know where you got torrents from. In my case, it is so that we can work around CORS issues where clients cannot access resources from other domains directly. So the server can download the resource instead and then deliver it.
So in my case it is not even the client who triggers the download. We prefetch things based on documents we have in our database. Then we download and store them. And when the client comes, we deliver it locally.
I am not sure it is so easy to implement storeRemoteFile, because there could be many ways a file can be retrieved. For example, sometimes some HTTP headers have to be set, and other times we have to access files stored in an S3 bucket and fetch them locally. So I would just do storeFile(file) and leave it to the caller to get the file.
Why are you writing ContactsFS and not CollectionFS?
ContactsFS is just an example, also used in the readme file.
I wasn't sure if the client triggered the server download (some NAS servers take a torrent file and go fetching those files). But I can see that it's the server side version of storeFile you need - I'll try implementing it next week.
Great! Thanks. So if I understand correctly, I will be able to store the file on the server side and then clients will get it in the published files collection? I am yet unsure, but how do I then get the content of this file on the client side? I would like to have it either as a buffer or as a URL I can point an AJAX query at - preferably both?
Yep, but you can write custom publish/subscribe functions.
There are two ways for the client to get the file:
1. ContactsFS.retrieveBlob - loads the file into a blob from the database. For more: https://github.com/raix/Meteor-CollectionFS#2-adding-the-controller-client-1
2. Write a simple filehandler that just caches the file on the server, and then use the fileURL to set src or href in the HTML:
Filesystem.fileHandlers({
  cached: function(options) {
    // NOP - return the blob and fileRecord unchanged so the file is
    // saved to the filesystem as-is
    return { blob: options.blob, fileRecord: options.fileRecord };
  }
});
{{#each Files}}
  {{#each fileHandler}}
    {{#constant}}
      <img src="{{url}}" alt="{{filename}}" width="20px"/>
    {{/constant}}
  {{else}}
    No cache
  {{/each}}
{{/each}}
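For the first route, a minimal sketch (the callback shape - fileItem.blob - is my assumption from the linked README section):

// Client side - load the file data into a blob (sketch, see README)
ContactsFS.retrieveBlob(fileId, function(fileItem) {
  if (fileItem.blob) {
    // Turn the blob into an object URL, e.g. for an <img> src
    var imgUrl = window.URL.createObjectURL(fileItem.blob);
  }
});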
What does "caching" mean? Where is it cached? I do not understand this, because I would like the files to be stored on the filesystem anyway?
So if I understand the schema, you are using DDP for both uploading and downloading the data? Hmm. I am a bit skeptical about this. I understand that for uploading this is nice. But for downloading it would be great if it were possible to, for example, redirect to S3 so that the client accesses the file there directly, or deliver it locally through your HTTP server. For development DDP is probably great, but for production?
(Maybe I am just not accustomed enough to this reactive nature of Meteor. :-) And maybe I wrongly want to do things traditionally. Please correct me.)
I created the filehandlers to give exactly that option - handling the db files when they are uploaded, e.g. saving/caching to the filesystem, uploading to other servers etc.
You can create multiple filehandlers per CollectionFS, e.g. if you want to create different image sizes, sound formats etc.
When saving to the filesystem you actually use HTTP to download (just like normal); the URLs are placed in files.fileURL as an array of {path}.
Filehandlers can write to the filesystem or just return a blob and let the system handle writing the file etc. It's described in the readme.
When a filehandler is done the file is updated and all clients are updated too - Meteor handles this pretty nicely.
Have a look at collectionFS.meteor.com (use Chrome) and try uploading a jpg. The 5 images are generated by 5 filehandlers, each just making a filesystem version.
Hi @mitar,
I've created the serverside storeBuffer and retrieveBuffer - there are examples in the readme and in the filemanager example.
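Roughly, on the server (a sketch; check the readme for the authoritative signatures - the options shown are assumptions based on the ones discussed in this thread):

// Server side: store a Node Buffer as a new file
var fileId = ContactsFS.storeBuffer('hello.txt', new Buffer('Hello world'), {
  contentType: 'text/plain',       // optional (assumed)
  metadata: { source: 'server' }   // optional custom data (assumed)
});

// ...and read the stored data back as a Buffer
var buffer = ContactsFS.retrieveBuffer(fileId);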
Also check out the new http://collectionFS.meteor.com - I added examples of drag&drop, server side file creation, filters and more.
Let me know if they work for you,
Great, will check it out soon!
Hm, would it be possible to also have storeStream? There is probably no need to create the whole thing in memory?
@mitar I'm thinking about making a storeRemote(). It would use streams for the job.
It would have options to set auth, headers etc., and would be wrapped inside fibers/future to make it sync for Meteor.
When the file is loaded it would trigger the filehandlers, if added, to do their thing.
I've prepared it so that storeRemote could be triggered by the client - the client could set options (headers/auth) too; the options would only be published to the owner.
Guess a storeFile that reads from the filesystem could be nice too.
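Very roughly, the shape I have in mind (hypothetical - none of this is implemented yet, option names included):

// Hypothetical storeRemote sketch - would run sync inside a fiber on the server
var fileId = ContactsFS.storeRemote('https://example.com/report.pdf', {
  headers: { 'Authorization': 'Bearer ...' },  // assumed option shape
  owner: someUserId                            // options would only be published to the owner
});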
No, this is too limited. There are many many sources of streams, not just HTTP. For example, I could have a BitTorrent client. :-)
OK, in reality, I am using the AWS SDK to retrieve an object from S3. So you see, it is not really reasonable to support all possible sources of streams. Just make it compatible with a writable stream. Something like:
stream.pipe(ContactsFS.storeStream('My stream file.txt', {
  // Set a contentType (optional)
  contentType: 'text/plain',
  // Set a user id (optional)
  owner: 'WAaPHfyfgHGaeJ5kK',
  // Stop live update of progress (optional, defaults to false)
  noProgress: true,
  // Attach custom data to the file
  metadata: { text: 'some stuff' }
}))
Ahh, I get it. I'll have a look at it next week - got some deadlines this week.
I'm thinking that the _id of the file might be important to some? Maybe:
var newStreamFile = ContactsFS.storeStream('My stream file.txt', {
  // Set a contentType (optional)
  contentType: 'text/plain',
  // Set a user id (optional)
  owner: 'WAaPHfyfgHGaeJ5kK',
  // Stop live update of progress (optional, defaults to false)
  noProgress: true,
  // Attach custom data to the file
  metadata: { text: 'some stuff' }
});

// Returns { _id, stream }
console.log('Created file record file id: ' + newStreamFile._id);

// Get the data
stream.pipe(newStreamFile.stream);
Hm, in fact, the question is how well streaming works with Meteor & Fibers.
But yes, something like this could also be done. And how could I do the progress bar?
Do you know the size of the stream? That's one of the things I want to look more into - and being able to set the chunk size in the stream would be nice too.
But I guess leaving out noProgress: true should keep the fileRecord updated - no matter what, it updates on completion.
I'm also thinking about resumability of a stream - if the stream fails it should be possible to try again without loading from the beginning - which is why I'm thinking about storeRemote handling all this for you (having it retry, wrapped in sync). But I'll have to think of a way.
Maybe more like setting the stream as a parameter, with options id and length - if no length, then noProgress would default to true; if id, then resume:
var id = ContactsFS.storeStream('My stream file.txt', stream, {
  // Set id to resume
  id: fileId,
  // Set length
  length: filesize,
  // Allow it to run async (default would be sync in fibers)
  callback: myCallback,
  // Set a contentType (optional)
  contentType: 'text/plain',
  // Set a user id (optional)
  owner: 'WAaPHfyfgHGaeJ5kK',
  // Stop live update of progress (optional, defaults to false)
  noProgress: true,
  // Attach custom data to the file
  metadata: { text: 'some stuff' }
});
Yes, I can have the length, at least in my case: I get the length in the HTTP header and then I download.
BTW, I still don't know how I can store files only to the hard disk, without GridFS.
The filehandlers would do that for you, e.g. saving to disk when run - they handle the file when it's ready and the server has time for it.
All data is handled via the database, but one thing I've been thinking about is having the ability to set an option to use the filesystem to store the file/chunk data instead. I would still use the database while data comes from the client, since chunks might not be ordered - but a default filehandler could then save to disk, empty the chunks, and set a filesystem reference in the db fileRecord.
But data-integrity-wise the database is a good place for files, and speed-wise I don't think there's much difference between filesystems and databases - they are basically the same. (Would be nice with some benchmarks on it - db vs. filesystem.) So when speaking about having it only on disk, it's only about saving storage.
Hm. Why the database? If the database is not on the local machine, I do not really want data to be sent over the wire to some database.
No, the filesystem has a nice advantage in that you can use existing distributed/caching filesystems and other tools. It is also much easier to move files around, to cloud storage and so on. Furthermore, you do not have to store files twice, in the database and locally.
True, that's why I made the filehandlers - to exploit the existing infrastructure with caching etc., but the db makes it all possible. When you use the filehandlers the browser loads the file via HTTP from the filesystem/cache.
But what I'm saying is, the problem could be solved if one could do the following:
Declare a filehandler named 'master'; this could trigger the chunks being removed from the database, and all other filehandlers would get handed the master file - not the database file.
The master could do all the normal stuff, like changing the size of an image (to limit data usage on the filesystem).
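As a hypothetical sketch of that idea (the 'master' convention is not implemented - this only illustrates the proposal):

ContactsFS.fileHandlers({
  // Hypothetical 'master' handler: once it completes, the chunks could be
  // dropped from the database, and later filehandlers would be handed
  // this file instead of the database copy
  master: function(options) {
    // e.g. resize the image here to limit data usage on the filesystem
    return { blob: options.blob, fileRecord: options.fileRecord };
  }
});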
I must say that maybe I do not yet understand enough about this filehandler stuff. Could you create an example which stores data only on the filesystem?
Anyway, also storing temporary files in the database is probably overkill - sending things around over the wire... hmm.
Basically I made the filehandlers because I wanted a "cached" version of the file on my filesystem - I don't believe sending large data to the client should go over DDP, rather HTTP (ideally in this case FTP, since it's optimized for file transport?...)
When a file is uploaded in CollectionFS it's in the database - actually in two collections, suffixed .files and .chunks:
.files holds the info/fileRecord about the file - length, owner etc. - and the fileHandlers that have completed.
.chunks holds the data sent from the client and makes resume possible; even multiple users could upload the same file at the same time from different locations (not implemented, can't find a use case for that...).
Chunks are not available on the client; they are saved/loaded via a method - resembles AJAX?
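To illustrate, a hypothetical .files document using only the fields mentioned in this thread (real records contain more):

{
  _id: 'WAaPHfyfgHGaeJ5kK',
  filename: 'logo.png',
  contentType: 'image/png',
  length: 34821,                     // total size in bytes
  owner: 'someUserId',
  metadata: { text: 'some stuff' },  // custom data attached at insert
  fileHandlers: {                    // completed handlers, e.g. url and extension per handler
    cached: { url: '/cfs/logo_cached.png', extension: 'png' }
  }
}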
When the upload is complete and filehandlers are defined, the file is put in a queue on the server.
When the server has time it runs one filehandler at a time - the filehandlers that are applied to the collection (these are custom functions you have to define; the readme and example show how).
If one of these functions fails (returns false) it is retried 3 times before moving on to the next task or filehandler. If the server gets bored it sleeps for a period, then crawls through to see if any new filehandlers are defined, and tries to run failed filehandlers again, on the theory that they could rely on a remote server and connection.
If a filehandler returns a buffer and fileRecord, the server handles the buffer by saving it to the filesystem and updates the details in the fileRecord. (Your file is now on the filesystem, and the fileRecord's fileHandlers entry holds info about it, e.g. url and extension per fileHandler.)
The filehandler is passed an options argument holding a buffer (maybe a stream in future, for memory?), a fileRecord, and a helper function destination, where you can set an optional extension and get back a serverFilename plus fileData.url and fileData.extension.
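A sketch of a handler that writes the file itself, pieced together from this description (the exact destination() return shape and the handler's return value are assumptions):

ContactsFS.fileHandlers({
  saveToDisk: function(options) {
    var fs = Npm.require('fs');
    // destination() takes an optional extension and, as described above,
    // hands back a safe server path plus url/extension for the fileRecord
    var dest = options.destination('jpg');
    // options.blob is a Buffer on the server, so writeFileSync accepts it
    fs.writeFileSync(dest.serverFilename, options.blob);
    // No blob returned - the server only updates the fileRecord with
    // the fileData (url and extension)
    return dest.fileData;
  }
});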
But have a look at the example - it shows how to get URLs into the templates. It's all reactive, which is why the db is nice: the files appear in real time as the filehandlers work through the queue (try pressing "reset filehandlers" in the example to see this in action).
Well, I kind of agree, but it's a complicated set of pros/cons - I weigh flexibility/security, so it makes life easier for me. Depending on the setup, some systems have dedicated fileservers - there, data goes over the wire too when the server handles files?
Update... If a filehandler doesn't return a blob, it could be because it wrote the file itself (using destination to get a safe place) - so it returns only the fileData (a URL and extension), and the server updates the fileRecord with this.
Wow, thanks for this explanation!
OK, but then the data is stored twice? In the database, and processed on the disk?
And no, FTP is not really much more optimized for files than HTTP. And it is an old protocol - it requires open ports on the client in the original specification.
Yep, at the moment - but I've made issue #34 for making the db temporary: delete the chunks in the db when a master file is generated.
I'm closing this - look at #77. Some of the new stuff: a full HTTP REST API, an HTTP fileserver, and storage handlers.
This way one can serve a storage handler directly at a fileserver mount point, and have filehandlers save into some storage handler, making it a very flexible setup.
It all results in smaller reusable packages.
Wow. Great! Will check it out.
@raix and @mitar, along the same lines: I have a Uint8Array containing the data for a png image that I convert to a base64String. Is there any way to store this in CollectionFS? More clearly: could CollectionFS use a data string such as 'data:image/png;base64,' + base64String, where this string is a URL of the stored image? Hopefully that makes sense - if not, I can explain myself a little more. Thanks for your thoughts.
@nspangler is it client side, and are you on the old cfs or the new devel-merge branch?
@raix it is client side and the old cfs.
I think it would be better to use devel-merge, or wait a week or two - I think we'll push v2 out. There are ways to convert to/from base64 and blob; some use the canvas object for converting. We could perhaps make v2 accept base64 data / data URLs - and we should also be able to get a file as a data URL. Should probably add this as a separate issue?
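For the conversion in the meantime, a minimal client-side sketch using plain browser APIs (no CollectionFS specifics):

// Convert a base64 data URL into a Blob that can be inserted client side
function dataUrlToBlob(dataUrl) {
  var parts = dataUrl.split(',');
  var contentType = parts[0].split(':')[1].split(';')[0]; // e.g. 'image/png'
  var binary = atob(parts[1]);                            // decode the base64 payload
  var bytes = new Uint8Array(binary.length);
  for (var i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return new Blob([bytes], { type: contentType });
}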
@raix, one option for adding support in devel-merge would be to update fsFile.setDataFromUrl() so that if the URL begins with "data:", it loads the data from the data string. However, it seems like it would be rare that you would have a base64 string but not also the binary/arraybuffer/blob, which you could insert directly. So I don't know if it's worth the effort.
@raix and @aldeed, I agree with what aldeed is saying. I have decided to go a different route: I have stored a file from an Objective-C application into my collectionfs Meteor MongoDB collection. The file is saved in both contactsFS.chunks and contactsFS.files. At first it was erroring out Meteor due to the length parameter, so I changed the length field's name so it would not error out. However, now the file cannot be accessed by CollectionFS, because the file has never gone through a filehandler. How would I make a file inserted into the collection from an outside source go through the filehandlers? I have tried using ContactsFS.retrieveBlob() and then reinserting it back into the collection, but that attempt failed. Right now I have the raw image sitting in my contactsFS (see screenshot below:)
I'll have to look deeper into this - length should be converted into a string. Meteor's usage of underscore is causing this; see issue #594 in Meteor (the only issue marked "confirmed").
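In other words, a stopgap for the outside writer could be something like (a sketch; fileRecord stands for whatever document is being inserted):

// Store length as a string so Meteor's underscore usage doesn't trip over it
fileRecord.length = String(fileRecord.length);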
@raix thanks. Is there a way to access the contacts.chunks in Meteor? ContactsFS.find() only returns the contacts.files. If I can access the .chunks then I can use the BinData and build an image off of that.
Remember to set the encoding to binary.
Hi, guys! I'm very new to Meteor and am trying to use the CollectionFS package, but when I try to define a file handler like so:
Filesystem.fileHandlers({
  default1: function(options) {
    // options contains blob and fileRecord - the same is expected in return
    // if it should be saved on the filesystem; can be modified
    console.log('I am handling default1: ' + options.fileRecord.filename);
    // if no blob is returned then the result is saved in the fileHandlers entry (createdAt added)
    return { blob: options.blob, fileRecord: options.fileRecord };
  }
});
I'm getting an error: ReferenceError: Filesystem is not defined
Could you please help me figure out what the problem is? Thanks!
Oops, I got it! Stupid me :) It should be my collection's name, myFileSystem, instead of Filesystem.
@iliaznk check out this example: https://github.com/mxab/cfs-multi-filehandler-example - it should be yourCollectionName.fileHandlers({...
@nooitaf yes, that's what I meant, thank you! I just have one more question: how do I specify where to save the file on the server?
With cfs v1 you can't - they get stored in .meteor/local/cfs, or when deployed, in ../cfs.
Ok then, is there any way to serve the saved files (images in particular) as URLs for an img src?
@iliaznk https://github.com/mxab/cfs-multi-filehandler-example/blob/master/app.html#L54 - I'm getting a bit rusty on the v1 API, which is why I link to the code; hope that's ok,
@raix thank you! But I've tried that before and it's not working - the images won't show up. Probably because Meteor does not serve any static files outside of the public dir?
It should work - @nooitaf made a cfs-public-folder package that v1 depends on. Check whether the files are stored on the filesystem as @nooitaf mentioned.
Btw. Merry Christmas y'all :)
@raix thank you! Merry Christmas to you too!
Something weird is going on. I've seen the files stored there before. Then I reset the app and the dir was gone. Then I uploaded a couple of files, but the dir still wasn't there; instead a file named as the path to the dir appeared, with a lock symbol at the beginning.
Then I reset the app again and uploaded files; now I got the dir, but when I go there it's not showing anything - not like an empty dir would look, but as though it's not letting me in and hiding the contents, you know... Sorry, guys, I'm not a Linux pro yet; hopefully I made that clear enough.
Is it possible to use CollectionFS on the server side, so that my code on the server downloads a file and puts it into CollectionFS, where it can then be downloaded/viewed by the client?