orhun / rustypaste

A minimal file upload/pastebin service.
https://blog.orhun.dev/blazingly-fast-file-sharing
MIT License
757 stars 47 forks source link

Allowing predefined filenames when using random_url #217

Closed seqizz closed 6 months ago

seqizz commented 8 months ago

This is the fruit of the discussion #216 .

Rustypaste can allow bypassing the random_url server setting with an explicit dict parameter like:

random_url = { type = "alphanumeric", length = 8, enforce = false }

(true being the default on this example)

So clients can send expected file name by filename parameter on multipart request to place the file on requested path. E.g.

$ curl -F "file=@x.txt;filename=wow.nice" http://localhost:8000
http://localhost:8000/wow.nice
tessus commented 8 months ago

It's not a bad idea, but how would you handle the following 2 cases?

seqizz commented 8 months ago

Looks like if we're not using random_url, the "filename" in the multipart request overwrites the data (at least on v0.14.1):

$ echo test1 > test1
$ echo test2 > test2
$ curl -F 'file=@test1' "http://localhost:8000"
http://localhost:8000/test1.txt
$ curl http://localhost:8000/test1.txt
test1
$ curl -F 'file=@test2;filename=test1' "http://localhost:8000"
http://localhost:8000/test1.txt
$ curl http://localhost:8000/test1.txt
test2

Weird default in my opinion, I'd normally suggest returning HTTP409 for conflict for such cases. But if you don't want behavior change we can simply overwrite it in this case too.

For the second question, another issue is already there: https://github.com/orhun/rustypaste-cli/issues/82

tessus commented 8 months ago

Looks like if we're not using random_url, the "filename" in the multipart request overwrites the data

I see, I guess in that case it should be kept consistent.

Weird default in my opinion, I'd normally suggest returning HTTP409 for conflict for such cases.

Hmm, I am sure orhun had a reason for this. For me it makes sense, if I think about "updating" a file. Let's say I upload a file and a few minutes later I find out that there is data missing or new data is available. Previously there was no delete option for files on the server, thus the best option was to just overwrite the file. Now with the delete option available, it could make sense to add another option (e.g. overwrite = true/false). Either way, this is probably not in the scope of this issue, but a separate one.

For the second question, another issue is already there: https://github.com/orhun/rustypaste-cli/issues/82

Oops, sorry, I didn't check the cli repo.

orhun commented 8 months ago

Actually the reason why we overwrite the file is that this case was never considered 🙈 I think it would be best to return an error as default and have a configuration option to change this. As @tessus said, this discussion definitely belongs to another issue.

tessus commented 8 months ago

I was just wondering whether the enforce option was even necessary. You usually don't set the header filename, unless you want to override/set it anyway. Or maybe I don't recall the code correctly. I'll have to look at it again. On the other side, such an option would be an additional safeguard, so it's not a bad thing to have either.

orhun commented 8 months ago

Yes, I agree.

Would you be possibly interested in taking a stab at this btw @tessus?

tessus commented 8 months ago

Yes, I would, but I am not sure when I'll get to it.

orhun commented 8 months ago

No worries, just let me know. Anyone is free to pick this up in the meantime!

tessus commented 7 months ago

I just started to implement this and there is a problem that is not solvable. At least not as is. This is not an issue with the client or the server, but how headers work.

The headers do not differentiate between the following 2 invocations with an additional flag:

curl -F "file=@x.txt;filename=wow.nice" http://localhost:8000
curl -F "file=@x.txt" http://localhost:8000

This means that the server does not know that the client had overridden the filename. The server only receives the filename in the content data. So there is no way to actually make this work.

However, if we do something like this, we can make it work:

curl -F "file=@x.txt" -H "filename:wow.nice" http://localhost:8000

Another option would be to set another header, e.g. something like override=true, override=1, ... in which case the random generation is bypassed and the filename that is found in the content data is used.

e.g.:

curl -F "file=@x.txt;filename=wow.nice" -H "override:1" http://localhost:8000
seqizz commented 7 months ago

Thanks for getting a stab at it.

I am mostly potato on http, but if we're sending this as part of form (as first example) afaik it will be represented under body instead of header on server side (named Content-Disposition). Not sure about the implementation, maybe we don't accept the data before we're done processing the standard headers, but can't we parse from there? :thinking:

// edit: Also sending the override name directly as header with -H filename:wow.nice might also work, from end user perspective (if that's easier).

tessus commented 7 months ago

it will be represented under body instead of header on server side (named Content-Disposition

Exactly, but the issue is that the server does not know that you used the ;filename=... suffix in the form. It only sees the filename. The server does not know that you gave it a name different from the real filename. Thus the server can't take the branch not to generate a random filename.

Update: Which is why I suggested 2 ways we can make this work:

curl -F "file=@x.txt" -H "filename:wow.nice" http://localhost:8000
curl -F "file=@x.txt;filename=wow.nice" -H "override:1" http://localhost:8000

I just need input from @orhun which one to implement.

tessus commented 7 months ago

I think

curl -F "file=@x.txt" -H "filename:wow.nice" http://localhost:8000

is easier to implement, especially for the cli app. (Although it is very much possible that my opinion is incorrect.)

But

curl -F "file=@x.txt;filename=wow.nice" -H "override:1" http://localhost:8000

would probably be more in line as to how curl is currently used when random is off.

tessus commented 7 months ago

@orhun you might not have seen above update, because I mentioned you in an update and I think GH doesn't send notification in such a case. If you are busy, I'm sorry to bother you with this.

orhun commented 7 months ago

@tessus sorry for late reply and thanks for the ping!

I think -H "filename:wow.nice" is much cleaner and more intuitive than using both form and headers so let's go with it if possible.

However, I also think we can simply use the header::ContentDisposition struct for mitigating this problem. For example:

self.inner.parameters // [Name("file"), Filename("wow.nice")]
self.get_file_name()   // wow.nice

Or am I missing something?

tessus commented 7 months ago

However, I also think we can simply use the header::ContentDisposition struct for mitigating this problem. For example:

True, but we still have to tell the server not to generate a random filename. The server only sees the filename. get_file_name() will return the following:

"file=@x.txt;filename=wow.nice" -> wow.nice "file=@x.txt" -> x.txt

So how would the server know not to generate a random file name?

orhun commented 7 months ago

We can probably check self.inner.parameters contains Filename(...) and don't generate a random file name if it exists.

tessus commented 7 months ago

This would be great. Maybe I was mistaken then. What does self.inner.parameters show for "file=@x.txt"? Because get_file_name() definitely returns x.txt.

Either way, adding this to the CLI app will be complex, since it currently does not support passing args, which are not in the config file, to the upload functton.

orhun commented 7 months ago

What does self.inner.parameters show for "file=@x.txt"

[Name("file"), Filename("x.txt")]

I think I see the problem now 😃 sorry it took a while.

Then I think the best approach is to use a header.

tessus commented 7 months ago

I think I see the problem now 😃 sorry it took a while.

No worries. It took me a while too.

Then I think the best approach is to use a header.

Right, but which one? ;-) Haha. filename:wow.nice or override:1? There are pros and cons for both options. I've been trying to wrap my brain around which one is better.

orhun commented 7 months ago

I personally like filename:wow.nice, what do you think?

... we should have chosen a better file name as an example, this guy plays in my head every time I read it: ![image](https://github.com/orhun/rustypaste/assets/24392180/586b9a95-fd25-4639-a398-0cb296e64fd6)
tessus commented 7 months ago

Yep, let's go with that. We only have to specify what the behavior should be when using:

curl -F "file=@x.txt;filename=wow.nice" -H "filename:wow.evenbetter" http://localhost:8000

for random off (when on there's no choice anyway).

I think the header filename should take precedence. filename will be wow.evenbetter. What's your take on this?

orhun commented 7 months ago

I think the header filename should take precedence. filename will be wow.evenbetter. What's your take on this?

Agreed!

tessus commented 7 months ago

we should have chosen a better file name as an example, this guy plays in my head every time I read it:

btw, so true... :-)