pterodactyl / panel

Pterodactyl® is a free, open-source game server management panel built with PHP, React, and Go. Designed with security in mind, Pterodactyl runs all game servers in isolated Docker containers while exposing a beautiful and intuitive UI to end users.
https://pterodactyl.io
Other
6.75k stars 1.72k forks source link

Downloading a file that is being written to causes Content-Length to differ from actually received amount of bytes #4563

Open RainOrigami opened 1 year ago

RainOrigami commented 1 year ago

Current Behavior

When requesting a file download URL from /api/client/servers/{server}/files/download and downloading that file, the supplied Content-Length is larger than the amount of data that is being sent by the server when that file is being written to during the download process.

This has to be tested with large files or a throttled connection to guarantee that the file is changed during download. It also has to be tested by using tools that follow HTTP RFC in this case and verify the received amount of bytes against the Content-Length header, such as Postman. Please use such a tool to replicate this error. Other tools, like using a browser, may be more lenient to mismatches of this kind and will not cause an error.

It is unclear to me why Content-Length is larger than the transmitted amount of data. When the file grows during transmission, after Content-Length has been sent, then Content-Length should be smaller than the transmitted amount of data. Something does not add up, but the error can be reproduced.

Since supplying a Content-Length is a promise, implementations like Postman or .net HttpClient will error when not receiving the promised amount of data. This should therefore be classified as a bug.

Expected Behavior

When requesting a file for download, the amount of bytes specified in the Content-Length of the file download should match exactly the amount of bytes sent to the client, or no Content-Length should be sent at all if the value can not be guaranteed to be exact.

Otherwise software, such as Postman or .net-based (eg. HttpClient.GetAsync) will abort with a failure instead of downloading the file.

HTTP RFC describes Content-Length in https://www.rfc-editor.org/rfc/rfc2616.html#section-4.4 :

When a Content-Length is given in a message where a message-body is allowed, its field value MUST exactly match the number of OCTETs in the message-body. HTTP/1.1 user agents MUST notify the user when an invalid length is received and detected.

The current behaviour violates this directive and the expected behaviour is that this directive is followed.

Steps to Reproduce

Prerequisites:

  1. Server of test-egg installed, not started (egg: egg-file-download-error-test.zip)
  2. Postman test collection: Pterodactyl.postman_collection.json.zip
  3. Postman collection variables: base_url set to pterodactyl base url (eg. https://pterodactyl.file.properties), server_id set to server id of test-egg server, no need to set change download_url. Supply bearer token API key in authorization

Test:

  1. Run the test server
  2. Postman run Get Download URL
  3. Postman run Download Test File
  4. After Postman error, stop the test server to prevent the file from growing too large

This causes an error in Postman: image

To verify that this is caused by the file being written to during download, try steps 2 and 3 but with the server stopped.

If this does not cause an error in your Postman, wait until the file has grown to a few 100s of MB so that the download takes long enough for the test server to write to the file during download.

Instead of using the postman collection, you can do these steps manually:

  1. Postman invoke GET https://pterodactyl.file.properties/api/client/servers/{serverid}/files/download?file=test.log
  2. Postman invoke GET of the URL you received as result from step 1

Or instead of using Postman, you can also use C# script (net 6 console app):


using System.Net.Http.Headers;
using System.Text.Json;

string baseUrl = "https://pterodactyl.file.properties",
    serverId = "testserverid",
    apiKey = "ptlc_your_api_key_here";

using (HttpClient httpClient = new())
{
    httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);

    // Get download URL
    HttpResponseMessage downloadUrlResponse = await httpClient.GetAsync($"{baseUrl}/api/client/servers/{serverId}/files/download?file=test.log");
    string downloadUrlResponseContent = await downloadUrlResponse.Content.ReadAsStringAsync();
    JsonDocument downloadUrlJsonDocument = JsonDocument.Parse(downloadUrlResponseContent);
    string downloadUrl = downloadUrlJsonDocument.RootElement.GetProperty("attributes").GetProperty("url").GetString()!;

    // Download file, no check
    HttpResponseMessage downloadFileResponse = await httpClient.GetAsync(downloadUrl); // Exception here
    string fileContent = await downloadFileResponse.Content.ReadAsStringAsync();
}

Which will result in an exception: image

Or more detailed, with Content-Length validation:

    // Download file, with check
    HttpResponseMessage downloadFileResponse = await httpClient.GetAsync(downloadUrl, HttpCompletionOption.ResponseHeadersRead);
    long expected = downloadFileResponse.Content.Headers.ContentLength!.Value;

    using StreamReader reader = new(await downloadFileResponse.Content.ReadAsStreamAsync());
    long totalRead = 0;

    char[] buffer = new char[1024];

    int bytesRead;
    try
    {
        while ((bytesRead = reader.Read(buffer, 0, 1024)) != 0)
        {
            totalRead += bytesRead;
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.Message); // The response ended prematurely, with at least .... additional bytes expected.
        Console.WriteLine($"Expected {expected} bytes from Content-Length, received {totalRead} bytes from body. Body is missing {expected - totalRead} bytes!");
    }

    Console.WriteLine("End.");

Which prints out:

The response ended prematurely, with at least 2514 additional bytes expected.
Expected 244566482 bytes from Content-Length, received 244563968 bytes from body. Body is missing 2514 bytes!
End.

This confirms that the exception from HttpClient is indeed referring to the expected Content-Length as the calculated missing bytes are equal to the reported missing bytes from the exception.

Panel Version

1.10.4

Wings Version

1.6.4

Games and/or Eggs Affected

No response

Docker Image

No response

Error Logs

No response

Is there an existing issue for this?

matthewpi commented 1 year ago

It is unclear to me why Content-Length is larger than the transmitted amount of data. When the file grows during transmission, after Content-Length has been sent, then Content-Length should be smaller than the transmitted amount of data. Something does not add up, but the error can be reproduced.

Is the file being appended to or replaced? If the file is appended to then we can fix this by limiting the amount of bytes written to the client, however the content could still end-up being corrupted depending on the type of file being requested.

If the file is being truncated or modified to have less content, then there isn't a good way to work around this error. We could do something like pad the response, but that behavior could be considered even worse then having your client tell you exactly what the problem is. We could possibly add an option to skip sending a Content-Length header, as not sending one causes most clients to just read until the stream ends. The only downside of that is there would be no way to calculate a download progress or to pre-allocate buffers/disk space for downloads without fetching the file stats separately.

RainOrigami commented 1 year ago

Is the file being appended to or replaced? If the file is appended to then we can fix this by limiting the amount of bytes written to the client, however the content could still end-up being corrupted depending on the type of file being requested.

In the case of the test egg, content is only being appended to the target file. Which is why I don't understand why there is less data transmitted than specified in Content-Length. At the start of transmission, file has size n, which is put into the header. After transmitting a bit, file size increases to n+m, so Content-Length n should be smaller than n+m, but it is not.

Limiting the amount of data written to what is sent by Content-Length is a good approach for the case where more data is sent than specified, but I have not managed to replicate such a behaviour.

If the file is being truncated or modified to have less content, then there is a good way to work around this error. We could do something like pad the response, but that behavior could be considered even worse then having your client tell you exactly what the problem is.

I am not sure how other software, such as apache, would handle this case. For smaller files I think it may be sufficient to just read them completely into ram and transmit that. Since it does not change when changing the file on disk, it will definitely be complete. But this may be an issue for larger files (out of memory).

We could possibly add an option to skip sending a Content-Length header, as not sending one causes most clients to just read until the stream ends. The only downside of that is there would be no way to calculate a download progress or to pre-allocate buffers/disk space for downloads without fetching the file stats separately.

This would probably be the easiest way. Correct me if I'm wrong but I believe github does this, or did this in the past when downloading releases. So it's not out of the ordinary.