Arksine / moonraker

Web API Server for Klipper
https://moonraker.readthedocs.io
GNU General Public License v3.0
1.05k stars 406 forks source link

Can the file manager API also return a hash? #849

Open iandouglas opened 5 months ago

iandouglas commented 5 months ago

Is your feature request related to a problem? Please describe

In discussion with folks involved in the "FDM Monster" print farm software, they mentioned that when syncing a file over the network, OctoPrint will return a filename, file size, and a hash of the file, which can be used as a comparison for whether to re-copy a file over the network or not. They mentioned that Moonraker's API only returns the filename and file size, without a hash.

Describe the solution you'd like

Extend the file manager API to include a hash for any files returned in the response.

Describe alternatives you've considered

None at the moment.

Additional information

No response

Arksine commented 5 months ago

The upload endpoint accepts an optional SHA256 checksum in the form. The frontend should pre-calculate the hash and add it to the form. If the checksum doesn't match Moonraker will return a 422 error. This flow is generally superior over the requested flow (return a checksum after processsing), as prevents a file with a mismatched checksum from being written to the destination.

davidzwa commented 5 months ago

Hi,

The original question above was not intended to implement upload error prevention, but instead as a mechanism to prevent having to upload a file which already exists on Moonraker.

The idea is to optimize having to send 400MB files. The OP originally asked in DM about whether an approach like rsync was possible, and I had to fall back to API based approach. In that case knowing a checksum can prevent the client (the FDM Monster server in this case) to upload a big file (if its not new by only 1 bit) and then proceed immediately to print it.

In the proposed situation the file has to be uploaded (with checksum multipart field) after which Moonraker will (most likely) never throw an error as the new file is probably fine. Therefore the (large) upload is not prevented and no optimization is achieved.

Our feature request might have been misinterpreted a bit ;)

davidzwa commented 5 months ago

@iandouglas based on our latest conversation and change in requirements on my part, shall we close this issue?

Arksine commented 5 months ago

It looks like I did misinterpret the original request. While it sounds like you have reached an alternative solution, I thought it would be worth providing feedback on why I don't believe it would be a good idea to modify the return values for existing endpoints:

1) Moonraker attempts to avoid processing complete gcode files. There are exceptions (outlined below), generally speaking Moonraker attempts to read out a minimal part of the file in an effort to extract metadata as fast as possible. 2) Presuming Moonraker calculated a checksum for every gcode file, caching the checksum isn't necessarily reliable. A file could be modified after upload. 3) As mentioned above, Moonraker does process complete uploads in two scenarios: UFP uploads and when "object processing" is enabled and detected as necessary. Checksums aren't useful in either of these scenarios, as the the result is guaranteed to be different from the original file. That said, popular slicers now support native Klipper objects, so the latter scenario is less likely to occur.

FWIW, I would be willing to consider a new endpoint that calculates a checksum for a file on request. This way we avoid the major pitfalls outlined above.

davidzwa commented 5 months ago
  1. Moonraker attempts to avoid processing complete gcode files. There are exceptions (outlined below), generally speaking Moonraker attempts to read out a minimal part of the file in an effort to extract metadata as fast as possible.

And I completely agree with this from your point of view. I would do the same.

  1. Presuming Moonraker calculated a checksum for every gcode file, caching the checksum isn't necessarily reliable. A file could be modified after upload.

Agreed. If a file is changed on the root directory attached to Moonraker, that's up to the user to figure out (its simply out of my control). I hope to document this so the user is aware from the perspective of FDM Monster.

  1. As mentioned above, Moonraker does process complete uploads in two scenarios: UFP uploads and when "object processing" is enabled and detected as necessary. Checksums aren't useful in either of these scenarios, as the the result is guaranteed to be different from the original file. That said, popular slicers now support native Klipper objects, so the latter scenario is less likely to occur.

I see. Interesting to note.

FWIW, I would be willing to consider a new endpoint that calculates a checksum for a file on request. This way we avoid the major pitfalls outlined above.

It would be a nice thing. But let's get this question answered first:

One more question: is the checksum calculation done by POST /server/files/upload a heavy operation CPU wise (for the Moonraker server/host)? Or is this done in a streaming fashion while the upload is being processed (chunk by chunk) and therefore not of big impact? I want to be able to advise the users of FDM Monster appropriately whether checksums are wise for large files (400MB+)

Arksine commented 5 months ago

One more question: is the checksum calculation done by POST /server/files/upload a heavy operation CPU wise (for the Moonraker server/host)? Or is this done in a streaming fashion while the upload is being processed (chunk by chunk) and therefore not of big impact?

The checksum is calculated in a streaming fashion as the file is received. Its always calculated, regardless of whether or not the form includes the checksum argument, so there is no additional overhead. The checksum is calculated before the file is passed off to the metadata processor, so there it should always match the checksum calculated by the frontend before submitting the upload request.

davidzwa commented 5 months ago

Perfect! I suggest that we keep this issue open with a new title "Add API endpoint to recalculate file hash" and that it gets low priority. For my project at this point the feature is an edge case, but it might become important after the next half year when moonraker is fully supported and out of beta in FDM Monster.

If things change on my end I will let you know.