jhoerr / box-csharp-sdk-v2

A C# client for the Box API (v2).
http://developers.box.com/docs/
11 stars 15 forks source link

File Upload #23

Closed everettevola closed 11 years ago

everettevola commented 11 years ago

I am working through the API you have written for box. My use case involves the user uploading a file, via an application I am writing, to box.com. Is the only way to perform this action via the "CreateFile" method with a subsequent "Write" using a byte array with MemoryStream?

I am so far successful with uploading files as I describe above. However, I am worried about very large files being persisted to box using a byte array over a MemoryStream.

Any comments or examples would be very much appreciated.

jhoerr commented 11 years ago

Hi Everett, thanks for getting in touch.

As of Jan 27th the BoxManager.CreateFile() can take a MemoryStream. This method effectively executes a file POST; no subequent Write() should be necessary. This is currently supported in the latest NuGet package (1.3) if you're using that.

This is largely cosmetic though. To the best of my knowledge, Box does not currently support chunked/resumable uploads, so I have to convert the entire MemoryStream to a byte array prior to uploading it. I'll double-check with Box to ensure that this is necessary. What sizes of files are you working with?

everettevola commented 11 years ago

Thanks for getting back to me so quickly. I took your advice and am now passing a MemoryStream as part of one of the CreateFile overloads. I was closely following one of your samples and missed this before. Much cleaner this way. It is now good to know what you are doing behind the scenes to shuttle the data to box. The file sizes I will be working with should vary greatly. From several KB up to probably a few hundred MB. On that upper end, how do you think this CreateFile() method will perform?

jhoerr commented 11 years ago

The sample code! Probably ought to update that, no? I don't have much experience uploading really big files -- the largest thing anyone has tried to push through our system is ~30MB. I don't have any reason to believe that the SDK would suffer for larger files. Please let me know if you run into any problems.

jhoerr commented 11 years ago

There's a relevant stackoverflow thread on the topic of chunked uploads to Box.

prexer commented 11 years ago

Thanks guys. I'll get the stackoverflow thread corrected. We do not currently support chunked uploads. We are considering it, but no definitive timeline.

Uploading 100MB files should not be any problem. We handle up to about 5GB files cleanly, as long as you have a reliable network.

everettevola commented 11 years ago

Thanks for the comments about chunked uploads. Looks like if Box can handle up to 5GB, we won't have any issues.

As I work through this SDK, I am seeing another potential issue. I am streaming a file to Box via CreateFile(). A file with the same name already exists on the root folder. I am getting an exception with a short description of "item_name_in_use". I am also getting this error if I invoke MoveFile() and attempt to move the file to a subfolder where a file exists with the same name. I expected the file to be versioned, not overwritten or exception thrown.

In a quick prototype I threw together using the old BoxSync SDK against the Box API v1, the files are getting versioned.

Is there another method or overload that I am missing?

jhoerr commented 11 years ago

Hi Everett,

You're not missing anything, and you bring up an interesting point. I hadn't considered the versioning aspect of this before. My gut response is that even though Box maintains a version history, I want to be careful about making assumptions on behalf of the user. Help me think through this:

If I attempt to CreateFile() and the file already exists, it seems appropriate that the SDK would puke and cry. My feeling is that the expectation set by the method verb Create would be violated otherwise. In this case you should proceed to Write() over the existing version.

Move() is a little more open to interpretation. It seems reasonable that we could have an overwrite flag that would version a pre-existing destination file. The default behavior would continue to be throw an exception if the destination file already exists.

Your thoughts?

prexer commented 11 years ago

There are also if-match and if-none-match headers that you can use in the API calls to make sure that you don't accidentally over-write a file that 2 people are concurrently editing.

I agree with you John. If you try to CreateFile() and there is a filename conflict in that directory, then the API should throw you back an error, and your SDK should probably do the same. Otherwise whatever magical renaming/versioning mechanism is used may not be what the caller was expecting.

You could argue that if the create fails, then an overwrite might be reasonable in some cases... but often it's best to bounce that back to a user to ask them the question "File already exists, overwrite, or rename?"

-Peter

On Fri, Feb 8, 2013 at 8:22 AM, John Hoerr notifications@github.com wrote:

Hi Everett,

You're not missing anything, and you bring up an interesting point. I hadn't considered the versioning aspect of this before. My gut response is that even though Box maintains a version history, I want to be careful about making assumptions on behalf of the user. Help me think through this:

If I attempt to CreateFile() and the file already exists, it seems appropriate that the SDK would puke and cry. My feeling is that the expectation set by the method verb Create would be violated otherwise. In this case you should proceed to Write() over the existing version.

Move() is a little more open to interpretation. It seems reasonable that we could have an overwrite flag that would version a pre-existing destination file. The default behavior would continue to be throw an exception if the destination file already exists.

Your thoughts?

— Reply to this email directly or view it on GitHubhttps://github.com/jhoerr/box-csharp-sdk-v2/issues/23#issuecomment-13297942.

everettevola commented 11 years ago

I have given this some thought and do agree with Peter, but the idea of a bit more flexibility on the developer side of this would be nice. I am really kinda on the fence to be honest.

The option to overwrite would be nice, but also along those lines, what about the option to version the file? These really represent two possible flags to be incorporated. Right now, I am not seeing any options to customize this behavior in this SDK.

With these two flags, the developer could really tailor the exact behavior called for in the use case.

Situations: :: Overwrite==false && Versioning==false, throw exception :: Overwrite==false && Versioning==true, create file and version as needed :: Overwrite==true && Versioning==false, overwrite existing file with same name :: Overwrite==true && Versioning==true, create file and version as needed, ignore Overwrite

The flexibility gained by adding these flags would enable the developer to customize a workflow to prompt the user that a file with the same name already exists from a caught exception. The user could then choose to overwrite or version, in which CreateFile() would be called again with the correct parameters.

As far as your concerns about the method verb Create, I completely agree. At first I was confused about the CreateFile() method because my brain was thinking file upload, not file creation. In reality, they are the same thing because the file is actually being created on Box via a byte array over a MemoryStream. I do not think you should add a UploadFile() method because that would muddle your API.

Just my 2 cents.

jhoerr commented 11 years ago

Versioning is performed automatically any time the content of a file is updated. To my knowledge there's no way to turn alter this behavior. (And IMHO that's exactly the way it should be.)

I need to think about the rest of what y'all have said. Lots to consider here.

jhoerr commented 11 years ago

[I just posted and deleted a comment in which I proposed adding a force parameter for Move() and Copy(). I'm now second-guessing this proposal because...]

The problem with providing an overwrite feature for Move() (and Copy()) is that it would require a full read of the file's contents into memory and then a write of that content over the pre-existing file. This would be a time-intensive operation for large files and folders and it has a certain "smell" that I'm not wild about. If we were to implement any sort of overwrite feature I think it would need to be all or nothing in that the pre-existing file/folder would first need to be deleted. This would allow us to avoid the read/write issue.

I'm not inclined to provide an overwrite feature of any sort for CreateFile(), as I am with Peter that it'd create a confusing behavioral expectation.

Your thoughts?

everettevola commented 11 years ago

Thanks for the followup. I am fine with not changing how file versioning is handled, or not handled. From an API perspective, everything you are saying makes sense. I am making sure that files are unique prior to upload to Box, so this is no longer really an issue for me.

I do have one other question related to file upload, so I thought I would post it here on the existing thread...

If I have a user who is uploading a file through my application, how can this file be uploaded directly to Box without first writing to the web server's file system? Is this even possible with your API? I am trying to avoid a double hop if at all possible. Any ideas?

jhoerr commented 11 years ago

The SDK will happily work with a MemoryStream, so in theory nothing necessarily need be written to disk. Are you working with IIS?

everettevola commented 11 years ago

Yes I am. I am using MemoryStream now to shuttle the contents of my file to Box, but only after saving the uploaded file to the web server's file system.

jhoerr commented 11 years ago

I'm afraid I don't know the answer. I think this is going to be a matter of getting IIS configured properly -- Stackoverflow or Serverfault might be a good place to look for help.

On Thu, Feb 28, 2013 at 12:53 PM, everettevola notifications@github.comwrote:

Yes I am. I am using MemoryStream now to shuttle the contents of my file to Box, but only after saving the uploaded file to the web server's file system.

— Reply to this email directly or view it on GitHubhttps://github.com/jhoerr/box-csharp-sdk-v2/issues/23#issuecomment-14246778 .

John Hoerr jhoerr@gmail.com 503.360.6581