Closed texnixe closed 3 months ago
In the lines that are the source of this issue, @bastianallgeier left the following code comment:
move the file to a location including the extension for better mime detection
https://github.com/getkirby/kirby/blob/3.3.4/src/Api/Api.php#L764-L770
These lines perform the following rename:
/tmp/php8212.tmp > /tmp/5e5c24f38420d.abba.jpg
Yes, we have to check if this is necessary. If not, we should remove this step. If yes, we have to make sure the /tmp
folder is cleaned up again.
I'd also prefer if there was a way to avoid the renaming. If not, we should add an instance variable that keeps track of the files and a __destruct() method that deletes the files from the list.
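A minimal sketch of that tracking idea, assuming a standalone helper class. The name TempFileTracker and its track() method are made up for illustration and are not part of Kirby:

```php
<?php

// Hypothetical helper, not part of Kirby: remembers renamed temp files
// and deletes whatever is still on disk when the object is destroyed.
final class TempFileTracker
{
    /** @var string[] absolute paths of files to clean up */
    private array $files = [];

    public function track(string $path): void
    {
        $this->files[] = $path;
    }

    public function __destruct()
    {
        foreach ($this->files as $path) {
            // files that were already moved elsewhere are simply skipped
            if (is_file($path) === true) {
                unlink($path);
            }
        }
    }
}
```

The uploader would call track() right after renaming the temporary file. Note that PHP does not run destructors after a fatal error, so this kind of cleanup would be best effort only.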
I just wonder: we already detect the uploaded file's MIME type from tmp_name in the previous lines. Why do we want to check it again? 🤔
https://github.com/getkirby/kirby/blob/3.3.4/src/Api/Api.php#L757
@bastianallgeier?
Sorry for the late response. The mime detection sometimes failed on those tmp files. Don't ask me why. That's why it was more secure to rename the file and then check again if it actually has the mime it pretends to have.
@bastianallgeier OK, that makes sense. But what I don't understand is where the second MIME check happens. The callback is called directly after the renaming takes place.
The reason why MIME type detection always works with the correct extension is probably our explicit fallback. So what renaming the file does is to make our Mime class fall back to the guessed MIME type based on the user-provided filename if it doesn't know the real type from the file contents (which will happen for all text files and other file types that don't have a magic number).
If we decide that we will continue to accept files where the MIME type cannot be reliably determined (which is what the current implementation does by relying on the user-provided data in this case), the solution would be to directly get the MIME type by extension if the automatic detection returns false. The renaming would then not be necessary anymore.
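The fallback described above could look roughly like this. It is a sketch under assumptions: mimeWithFallback(), the EXTENSION_MIMES map and the injectable $detect callback are invented for illustration and do not mirror Kirby's Mime class.

```php
<?php

// Assumed, deliberately tiny extension map for the sketch
const EXTENSION_MIMES = [
    'css'  => 'text/css',
    'csv'  => 'text/csv',
    'jpg'  => 'image/jpeg',
    'json' => 'application/json',
];

/**
 * Hypothetical helper, not Kirby's Mime::type().
 * $detect is injectable so the fallback path can be exercised without real files.
 */
function mimeWithFallback(string $path, string $filename, ?callable $detect = null): ?string
{
    $detect = $detect ?? static function (string $path) {
        // content-based detection via the fileinfo extension, if available
        return function_exists('mime_content_type') ? @mime_content_type($path) : false;
    };

    $mime = $detect($path);

    if (is_string($mime) === true && $mime !== '') {
        return $mime;
    }

    // detection failed: trust the extension of the user-provided filename
    $extension = strtolower(pathinfo($filename, PATHINFO_EXTENSION));

    return EXTENSION_MIMES[$extension] ?? null;
}
```

The temporary file keeps its original path here. Only the user-provided filename is consulted, and only when the content-based detection returns nothing usable.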
The second check is done in the *Rules.php classes. If I remember correctly Firefox was the main offender in this case. It often does not reliably send the correct MIME type on uploads. It depends on the system setup but we had the issue in v2 and v3. I'm not sure if the MIME detection via extension is a good idea because of that.
We cannot rely on the MIME type the browser sends us as that can be faked by attackers, so detecting it ourselves (which we do at the moment) is already a pretty good solution. But as I wrote above: If PHP cannot reliably determine the MIME type from the file contents, we need to decide if we accept the file anyway or if we block it.
Currently we accept it anyway, in which case we could rewrite the uploader code so that it no longer needs to rename the temporary file.
I'm trying to track all necessary changes down, but am getting a bit lost:
Remove https://github.com/getkirby/kirby/blob/master/src/Api/Api.php#L784-L790 and pass $upload['tmp_name'] instead of $source to the callback
But do we even need this whole part in the upload method: https://github.com/getkirby/kirby/blob/master/src/Api/Api.php#L771-L782?
If we start later on mime checking again as well? Wouldn't it make sense to all move that to the rules class and/or Kirby\Filesystem\Mime?
Remove https://github.com/getkirby/kirby/blob/master/src/Api/Api.php#L784-L790 and pass $upload['tmp_name'] instead of $source to the callback
Sounds good.
But do we even need this whole part in the upload method: https://github.com/getkirby/kirby/blob/master/src/Api/Api.php#L771-L782?
Yes, because we pass the resulting $filename to the callback as the second argument. I think this ends up as the final filename after the upload.
If we start later on mime checking again as well? Wouldn't it make sense to all move that to the rules class and/or Kirby\Filesystem\Mime?
The Mime::type() method already supports an optional $extension argument that allows overriding the fallback extension. So to replicate the current behavior without the physical renaming process we "only" need to pass the extension from $filename to Mime::type() when it is called from Filesystem\File::mime().
Idea: What if we add a new prop Filesystem\File::$filename. By default that would be null (= extract the filename from the $root). But it could be set to something completely different for use in the extension, filename and mime methods.
Example:
$file = new BaseFile([
'root' => '/tmp/example.tmp',
'filename' => 'my-image.jpg'
]);
$file->root(); // /tmp/example.tmp
$file->filename(); // my-image.jpg
$file->name(); // my-image
$file->extension(); // jpg
$file->mime(); // uses $this->extension() when calling `Mime::type()`
If we have this, we can just set this filename prop here and here and everything will still work even without physical renaming.
@lukasbestle I am not too comfortable with the filename prop solution as I think it could lead to weird other cases where we have a class that states the filename is A while it actually isn't.
I looked at the code again, which has also changed a bit since last year. If I still understand correctly, the issue here is that files are left in tmp, not necessarily how they are named (just that renaming prevents PHP's automatic deletion). But why are we then not using F::move() instead of F::copy() here: https://github.com/getkirby/kirby/blob/main/src/Cms/FileActions.php#L212? Shouldn't that resolve our issue of the leftover files?
But why are we then not using F::move() instead of F::copy() here: https://github.com/getkirby/kirby/blob/main/src/Cms/FileActions.php#L212? Shouldn't that resolve our issue of the leftover files?
I'd say moving would be better than the current implementation, but it won't fix the root cause of the issue (which is that the file gets moved to another file inside the temp directory at all; this isn't how move_uploaded_file() is supposed to be used).
This means that uploaded files will be moved from the temp dir if they are accepted by FileRules, but they will still remain in the temp dir if they are rejected by FileRules. This could actually be abused by Panel users to fill up the temp dir with large files that are not allowed by the blueprint.
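To make the two outcomes concrete, here is a rough sketch of a commit step that moves accepted files and also removes rejected ones. It is a stopgap for illustration only and not what Kirby does; commitUpload() and the $validate callback (standing in for the FileRules checks) are assumed names.

```php
<?php

/**
 * Hypothetical upload step, not Kirby's implementation.
 * $validate stands in for the FileRules checks and throws on rejection.
 */
function commitUpload(string $tmpPath, string $target, callable $validate): void
{
    try {
        $validate($tmpPath);

        // move instead of copy, so nothing stays behind on success
        if (rename($tmpPath, $target) === false) {
            throw new RuntimeException('The file could not be moved');
        }
    } finally {
        // runs for accepted and rejected uploads alike
        if (is_file($tmpPath) === true) {
            unlink($tmpPath);
        }
    }
}
```

This only treats the symptom. The intermediate file inside the temp directory still exists for the duration of the request.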
I think we can only solve this issue properly by getting rid of the intermediate temporary file. To do that, we need some way of passing down the target filename to the $upload object (see below).
I am not too comfortable with the filename prop solution as I think it could lead to weird other cases where we have a class that states the filename is A while it actually isn't.
Nico and I discussed this directly. It is too risky to have this "fake" prop in the core Filesystem\File class (could be used incorrectly quite easily). We came up with a few ideas (see below).
Our FileRules class performs a MIME check on the uploaded file before it is copied to the final destination:
The same happens for file replacing:
Because PHP creates the temporary uploaded file without the user-provided file extension, this MIME check will fail for every file type that doesn't have "magic bytes" (so basically for every text-based file format). For those files we need the file extension in the MIME check as a fallback.
Currently we solve this by renaming the temporary uploaded file to contain the file extension:
Because we use move_uploaded_file(), PHP will assume that the file was moved to the final destination. So it isn't cleaned up afterwards.
I think the only proper solution can be to get rid of that intermediate temp file. All the validation should be done on the original uploaded file.
To achieve that, we IMO need to make the following changes:
- […] is_uploaded_file() (if false, throw exception as there could be an attack).
- […] $debug argument of $api->upload().
- […] $upload['tmp_name'] to the callback in this line.
- […] $filename in Filesystem\File (see below).
- […] F::move() in the step where the file is moved to the final destination (upload and replace).
This will mean that the following happens on upload:
- […] $source (new). They would still receive the "clean" filename as $filename (like before).
- […] $source and $filename variables as props to HasFiles::createFile().
- HasFiles::createFile() passes these props 1:1 to FileActions::create().
- […] $source to $file->replace().
- FileActions::create() and $file->replace() somehow create an $upload object that uses the $filename as an override (new but still uncertain, see below).
- FileActions::create() and $file->replace() call $file->commit(), which calls $fileRules->create()/$fileRules->replace() with that $upload object.
- FileRules methods call the $upload->mime(), $upload->match() and $upload->validateContents() methods. The replace rule method also calls the $upload->extension() method.
- $upload methods can now refer to the target filename (new).
- […] F::move() in FileActions::create()/$file->replace().
Filesystem\File with custom $filename prop
Filesystem\TmpFile class
We would have a new class Filesystem\TmpFile extends File that would add the special $filename prop. This class could then override the filename(), extension(), is() and name() methods.
Also we would have to rewrite the following methods inside Filesystem\File: The mime() and type() methods would use the faked extension() internally. The sanitizeContents() and validateContents() methods would use the faked extension() and mime() internally.
Unsolved issue: There would be no way to use TmpFile for an Image object directly. For the upload use case, we'd at least need the match() method to use Image::$validations if the upload is an image.
TmpFile with an override for match() that supports Image
We could override the match() method so that it accesses Image::$validations if the fake $filename is of type image.
Unsolved issue: The TmpFile class would only be useful for upload handling in the core, but not generally useful:
- […] Image but wouldn't be able to provide access to other Image methods.
- […] Filesystem\File besides Image (every class would need to be supported explicitly in TmpFile).
TmpFile as a proxy class
To solve the access to all Image methods, we could make TmpFile a proxy class that receives the underlying File or Image object as well as the $filename.
Every method call that isn't overridden would be proxied to the asset object.
Unsolved issue: The overridden methods in the proxy won't be available to the underlying class, so we would need to override many of the underlying methods in the proxy even though their code would be unchanged.
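The proxy idea and its limitation can be sketched as follows. AssetImage is a stand-in for the real Image class, and TmpFileProxy and label() are made-up names for illustration:

```php
<?php

// Stand-in for Kirby's Image class, reduced to what the sketch needs
class AssetImage
{
    private string $root;

    public function __construct(string $root)
    {
        $this->root = $root;
    }

    public function root(): string
    {
        return $this->root;
    }

    public function extension(): string
    {
        return pathinfo($this->root, PATHINFO_EXTENSION);
    }

    // internal caller: always resolves extension() on the asset itself
    public function label(): string
    {
        return $this->extension() . ' file';
    }
}

// Hypothetical proxy: overrides the filename-related methods and
// forwards every other call to the wrapped asset object
class TmpFileProxy
{
    private object $asset;
    private string $filename;

    public function __construct(object $asset, string $filename)
    {
        $this->asset    = $asset;
        $this->filename = $filename;
    }

    public function filename(): string
    {
        return $this->filename;
    }

    public function extension(): string
    {
        return pathinfo($this->filename, PATHINFO_EXTENSION);
    }

    public function __call(string $method, array $arguments)
    {
        return $this->asset->$method(...$arguments);
    }
}
```

With a proxy for /tmp/example.tmp and the filename my-image.jpg, extension() reports jpg, while the forwarded label() still builds its result from tmp. Internal calls inside the asset never see the proxy's overrides, which is the unsolved issue named above.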
We are a bit stuck at the moment.
The second potential solution for TmpFile looks to be the most promising so far, but it's not perfect either.
Edit 30.12.2022: Updated summary to the state after merging #4943
We partly fixed the issue for 3.9.0. The uploaded file is cleaned from the tmp dir on successful uploads, but so far the file stays in tmp on failed uploads (see the issue summary above).
I think with https://github.com/getkirby/kirby/pull/6590 we have covered the majority of cases and all that we can cover without a lot of effort. Closing this.
Describe the bug
When you upload a file through PHP, it is usually temporarily stored in the /tmp folder until the script is finished. Since Kirby's upload method renames each file on upload, however, files pile up in the /tmp folder until manually removed. This can easily end up in people no longer being able to upload files due to space issues.
To Reproduce
Steps to reproduce the behavior:
- […] /tmp folder.
- […] /tmp folder and is not removed when the upload is finished.
Expected behavior
Files should either not be renamed on upload or removed again afterward.
Kirby Version
Tested with 3.3.4
Additional context
https://forum.getkirby.com/t/uploading-images-keep-a-copy-of-the-original-in-phps-tmp-folder/17469/2