File Folder Annotation - Githubissues

swhitley commented 11 years ago

I think it's important to offer users a way to organize file uploads. I'd like to suggest the following annotation to help with that:

{
    "type": "net.app.x.filefolders",
    "value": {
        "path": "folder1/folder2/folder3/folderN"
    }
}

path = A slash-delimited list of virtual folders.

Clients would read the annotation, parse the folder names, and represent the folders using standard hierarchical folder conventions (as appropriate for the client). The file would not be included in the path. The file is assumed to be contained within the right-most folder.

Folders are dependent on files. You cannot create a set of folders without a file. Likewise, when the last file is removed, the containing folder(s) will no longer exist.
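
A minimal Python sketch of how a client might turn this annotation into a folder view. The shape of the `files` list (a `name` plus an `annotations` list per file) is an assumption made for illustration, not the actual File API response format. Since folders are derived purely from the files, a folder disappears from the tree once its last file is gone.

```python
ANNOTATION_TYPE = "net.app.x.filefolders"

def folder_path(file_obj):
    """Return the folder names for a file, or [] if it has no folder annotation."""
    for ann in file_obj.get("annotations", []):
        if ann.get("type") == ANNOTATION_TYPE:
            path = ann.get("value", {}).get("path", "")
            # Split on "/" and drop empty segments (leading/trailing slashes).
            return [part for part in path.split("/") if part]
    return []

def build_tree(files):
    """Build a nested dict of the form {"folders": {name: subtree}, "files": [names]}."""
    root = {"folders": {}, "files": []}
    for f in files:
        node = root
        for name in folder_path(f):
            node = node["folders"].setdefault(name, {"folders": {}, "files": []})
        # The file lives in the right-most folder of its path.
        node["files"].append(f["name"])
    return root

# Hypothetical sample data: one annotated file, one file without a folder.
files = [
    {"name": "a.txt",
     "annotations": [{"type": ANNOTATION_TYPE, "value": {"path": "docs/2013"}}]},
    {"name": "b.txt", "annotations": []},
]
tree = build_tree(files)
```

Files without the annotation end up at the root, so a client that ignores folders entirely still sees every file.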

berg commented 11 years ago

Good idea. I think this is a good candidate for a core annotation.

There are some issues that I see, mainly around escaping/path separation. We could sidestep them altogether by supplying a list of path components instead of expecting them to be slash-separated. What do you think, @swhitley?

swhitley commented 11 years ago

@berg Thanks for the thoughts. Something simple like below? I'd just want to make sure that everyone understood the relationship between array index and folder level (i.e. index 0 folders would all be displayed at the same level).

{
    "type": "net.app.x.filefolders",
    "value": {
        "path": [
            "folder1",
            "folder2",
            "folder3",
            "folderN"
        ]
    }
}
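
To illustrate the escaping concern and the index-to-level relationship, here is a small Python sketch. The folder names are made up; the point is only that a name containing a slash cannot survive a round trip through a slash-delimited string, while a list of components keeps it intact.

```python
def split_slash_path(path):
    """Parse the slash-delimited form into folder names."""
    return [part for part in path.split("/") if part]

def levels(path_list):
    """Map each array index to its folder name (index 0 is the top level)."""
    return {depth: name for depth, name in enumerate(path_list)}

# Hypothetical path where one folder name itself contains a slash.
intended = ["Reports", "Q1/Q2", "Drafts"]

# Slash-delimited form: the second folder is silently split in two.
round_tripped = split_slash_path("/".join(intended))

# List form: three components, three levels, no escaping needed.
by_level = levels(intended)
```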

berg commented 11 years ago

Yeah, that seems sensible. Let me think about this a little bit more. I'm happy to make it core.

ludolphus commented 11 years ago

I have one concern: how are we going to build the 'entire' directory structure? I don't think it is an option to first get all files from the API and then build the directory structure. I'm thinking hundreds/thousands of files here...

berg commented 11 years ago

We're not planning to build an internal tree index of files here, so you'd have to pull them all down.

ludolphus commented 11 years ago

Yeah, I kinda thought that; otherwise you would probably already have built something into the File API. Also, from your latest podcast it's clear that the File API is not intended to be something like Dropbox.

ludolphus commented 11 years ago

could do some caching, but files can get deleted or moved to another folder even by updating the annotation. If you all do it within the same app then there's no 'problem', with multiple apps there will be

swhitley commented 11 years ago

@ludolphus Yes, it's the same with S3 and tools like S3Fox. There could be some inconsistencies if the cache is out of date, but I think it's still better than not having a method for organizing files.

strangebug commented 11 years ago

If this is not intended to build a complex directory structure, what about just having a bunch of tags on the file? My 2 cents.

swhitley commented 11 years ago

@sklouvi The intent is to be able to represent a simple directory structure with the path attribute. Tags would be useful too and I wouldn't be opposed to turning this into a file attribute annotation.

strangebug commented 11 years ago

@swhitley The separated path proposal is good. @ludolphus We can rely on the file id to figure out what happens on a move/rename operation, or did I get that wrong?

ludolphus commented 11 years ago

The file id never changes. But you will have to do something (query the API) to handle deleted files. Building a cache of files is an option, and then using pagination parameters to get new files. The only thing you won't know is whether files already in the cache have been deleted. You will find out when accessing those files, or by doing a full fetch of all files at a set interval, e.g. once a day, so as not to hit the API too much and to keep things fast in a client.

Maybe we need an include_deleted for files, just as there is for posts and messages. Querying a file that was deleted now gives you an error with code 403 and error_message 'Forbidden'. I would like to suggest either returning a code that says the specified file was deleted, or having the include_deleted option, or even an endpoint to query deleted files.
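
A rough Python sketch of the caching approach described above: incremental fetches for new files, plus a periodic full fetch to drop deleted ones. The `fetch_since` and `fetch_all` callables are hypothetical stand-ins for API calls, and integer, increasing file ids are assumed purely for illustration.

```python
import time

FULL_REFRESH_SECONDS = 24 * 60 * 60  # assumed interval: once a day

class FileCache:
    def __init__(self, fetch_since, fetch_all, now=time.time):
        self.fetch_since = fetch_since  # hypothetical: files with id > since_id
        self.fetch_all = fetch_all      # hypothetical: every file in the account
        self.now = now
        self.files = {}
        self.last_full = None

    def refresh(self):
        """Refresh the cache and return it as {file_id: file}."""
        due = (self.last_full is None
               or self.now() - self.last_full >= FULL_REFRESH_SECONDS)
        if due:
            # Full fetch: the only point at which deletions are noticed.
            self.files = {f["id"]: f for f in self.fetch_all()}
            self.last_full = self.now()
        else:
            # Incremental fetch: picks up new files, misses deletions.
            since_id = max(self.files, default=0)
            for f in self.fetch_since(since_id):
                self.files[f["id"]] = f
        return self.files
```

Between full fetches the cache can still list files that another app has deleted, which is the inconsistency discussed in this thread.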

neuroscr commented 11 years ago

I don't think pathing is a modern way to address this issue. We should look at tagging like sklouvi said, so a file can belong to more than one folder/tag. (Like Google docs) Tags can be hierarchical too.

mlv commented 11 years ago

Idea: have a type of file be a folder. It would contain the names and file ids that are intended to be part of that folder. Some of those ids could of course be other folders.
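
A minimal Python sketch of this idea, assuming a folder is a file whose content maps entry names to file ids. The `com.example.*` type names and the `entries` layout are hypothetical, chosen only to illustrate the recursion.

```python
FOLDER_TYPE = "com.example.folder"  # hypothetical file type for folders

def list_paths(files_by_id, folder_id, prefix="", ancestors=()):
    """Return the paths of all non-folder files reachable from a folder file."""
    if folder_id in ancestors:
        return []  # guard against a folder that (indirectly) contains itself
    paths = []
    for name, child_id in files_by_id[folder_id]["entries"].items():
        child = files_by_id.get(child_id)
        if child is None:
            continue  # the entry points at a file that no longer exists
        if child["type"] == FOLDER_TYPE:
            paths.extend(list_paths(files_by_id, child_id,
                                    prefix + name + "/",
                                    ancestors + (folder_id,)))
        else:
            paths.append(prefix + name)
    return paths

# Hypothetical account contents, keyed by file id.
files_by_id = {
    "1": {"type": FOLDER_TYPE, "entries": {"photos": "2", "notes.txt": "3"}},
    "2": {"type": FOLDER_TYPE, "entries": {"cat.jpg": "4", "gone.jpg": "9"}},
    "3": {"type": "com.example.text"},
    "4": {"type": "com.example.image"},
}
paths = list_paths(files_by_id, "1")
```

Unlike the annotation approach, this allows empty folders and lets one file appear in several folders, at the cost of keeping folder files in sync when files are deleted.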

swhitley commented 11 years ago

@mlv That might actually be brilliant. Because you can filter by file_types, you could select all of the "...folder..." types to get the folder structure without having to enumerate every file.

I'll think about this a bit more.

berg commented 11 years ago

In the long run, this'll probably end up being less efficient than just pulling down files. To understand why, let me pull back the curtain on what actually happens when you request a filtered stream of objects (be it posts w/ include_deleted=0, files with a filter type, channels with a filter type, etc.)

We actually go through and fetch every candidate object, ~200 at a time IIRC, and thaw 'em out of the database, create several objects and do a comparison. If we haven't returned as many objects as you've asked for, we go back to the database, get another 200, etc. What that means is that we're iterating over basically every object to get the folders list. Now, obviously, there is a bit of a performance win by us doing the comparison, because we're closer to the database, etc., but just understand that these calls can, in the worst case, take several seconds to return anything depending on the order in which things were created, etc. So you might be better served by incrementally loading in data so that you can show users something that's not a spinner...
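
The loop described above can be sketched roughly as follows. This is an illustration of the described behavior, not the actual server code; the batch size is the "~200" quoted from memory in the comment, and the in-memory list stands in for the database.

```python
BATCH_SIZE = 200  # "~200 at a time", per the description above

def filtered_page(all_objects, predicate, count):
    """Return up to `count` matching objects and how many objects were scanned."""
    results, scanned = [], 0
    while len(results) < count and scanned < len(all_objects):
        # Fetch the next batch of candidates, then filter it.
        batch = all_objects[scanned:scanned + BATCH_SIZE]
        scanned += len(batch)
        results.extend(obj for obj in batch if predicate(obj))
    return results[:count], scanned

# Hypothetical account: 2000 files, of which every 500th is a "folder" type.
objects = [{"id": i, "type": "folder" if i % 500 == 0 else "file"}
           for i in range(1, 2001)]

def is_folder(obj):
    return obj["type"] == "folder"

folders, scanned = filtered_page(objects, is_folder, count=20)
```

With sparse matches, a request for 20 folders scans every object in the account, which is why a filtered call can be slow to return anything.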

fwanicka commented 11 years ago

I am in the process of creating a client for the File API. I would propose a native Folder API be added before going too much further. IMO, using annotations as the folder structure is not maintainable in the long run. What happens when a user has 100,000 files? Caching the files on the client (especially a phone) doesn't seem viable, nor does downloading the list on demand. I understand that this is not meant to be used like Dropbox, but it seems like something needs to be added to make this manageable in the long run.

peteburtis commented 11 years ago

Hi folks. I just posted my app FileBase, which is a simple view into your file storage, to the app directory. I thought I'd put my 2 cents in.

After a lot of thought, I think FileBase will never be able to properly support a traditional, navigable, folder hierarchy on top of app.net, so I probably won't try.

The first problem is that folders often have an astronomically huge number of objects in them; often many more than even the folder's owner realizes. The Applications folder, on the Mac, for example, when you consider the fact that each app is itself a folder bundle, probably contains easily 5,000-10,000 or more objects. The rate limit for uploading objects is at best 20 per minute. So the user requests we upload the Applications folder and we do what? Throw up a dialog that says, "upload in progress, estimated completion in 2 months"?

The second problem is supporting empty folders. There's no good way to do it with annotations because there are no objects to annotate. Sure, we could come up with a standard that an empty folder is represented by a one byte object named "EMPTY", or something. But that sucks on non-conforming clients, and the vast majority of clients will probably be things that do other things but store and retrieve objects in the files API, so they won't have much incentive to conform.

My solution is probably going to be something along the lines of zipping any folder that the user asks to upload, setting an annotation that says this zipped object represents a folder, and then displaying it as a folder and unzipping it transparently on download. I think the sharing story with this solution works better, too. A user tries to share a folder, and they end up sharing a zipped version of that folder without having to do anything special.
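
A minimal Python sketch of the zip-and-annotate approach, using only the standard library. The annotation type `com.example.zippedfolder` and its `name` field are hypothetical; no such annotation is defined in this thread.

```python
import io
import tempfile
import zipfile
from pathlib import Path

def zip_folder(folder):
    """Zip a folder in memory; return (zip_bytes, annotation)."""
    folder = Path(folder)
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(folder.rglob("*")):
            if path.is_file():  # note: empty subfolders are not stored
                # Store paths relative to the parent so the archive
                # unpacks into a single top-level folder.
                zf.write(path, path.relative_to(folder.parent).as_posix())
    annotation = {
        "type": "com.example.zippedfolder",  # hypothetical annotation type
        "value": {"name": folder.name},
    }
    return buf.getvalue(), annotation

# Demo on a throwaway folder.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "MyFolder"
    (src / "sub").mkdir(parents=True)
    (src / "a.txt").write_text("hello")
    (src / "sub" / "b.txt").write_text("world")
    data, annotation = zip_folder(src)

with zipfile.ZipFile(io.BytesIO(data)) as zf:
    names = sorted(zf.namelist())
```

A client that does not understand the annotation still sees an ordinary zip file, which any user can download and open.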

To me, trying to graft a folder structure onto the ADN API feels like trying to make it do something it really can't do well. It'll lead to poor user experiences at best.

Sorry to be a downer, just one guy's opinion.

fwanicka commented 11 years ago

The only downside of that is you have to download a whole folder to get at one file. You could easily end up downloading 10 megs (or a lot more potentially) to be able to get one 100K photo. And the "folder" is then not browsable via a web client.

I'm not trying to nitpick. There really is no good solution. I just thought I'd point out the cons with that approach. I've shelved my File Manager app for now to see if there is some eventual consensus on this. I'm afraid there are going to be half a dozen folder schemes implemented by different devs, none of which work with the others. If so, the end-user experience is going to be terrible. They're going to be on their desktop client trying to find the file/folder they created on their phone, and it's not going to be there.

peteburtis commented 11 years ago

No nitpicking perceived. I agree, the bottom line is there's no good solution, and I wouldn't necessarily propose my solution as a standard; just something that makes sense from the perspective of the user when using my app specifically.

The zip thing is really conceived as a solution to the bundle problem which is specific to OS X. Some filesystem objects (Applications, certain documents) are really directories, but look to the user like single objects. There's really no way to throw up a dialog and say "that thing you thought was a single file is really a directory, and App.net doesn't store directories." So zipping the object and uploading it makes sense, and in fact would probably be relatively transparent to the user across the board (most Mac browsers unzip files on download, for example).

Now that you mention it, I may end up not supporting "folders" at all, just opaque bundles. Or perhaps when a user drops in a folder, I'll throw up a dialog explaining that app.net doesn't support folders, and would they like to upload an archive of the folder instead?

I hope that client developers go forward with a first-do-no-harm mentality when it comes to folders (or anything else, really). Whatever crazy standards we come up with, the user should still feel right at home in a client that simply lists all files in the account, without parsing annotations or anything else.

berg commented 11 years ago

I've had a few side-conversations about this, but hadn't gotten around to updating this issue yet. From our perspective, the best top-level organizational object is the "type" key. This lets us separate things by "application" without siloing them into folders only accessible by a single app. We will likely add more granular permissions based upon requesting access to specific file types.

Second, the plan is to allow users to define their own derived files. So each top-level file could actually be something like a bundle -- though I think in many cases the zip file approach would work well.

appdotnet / object-metadata

File Folder Annotation #7