darktable-org / darktable

darktable is an open source photography workflow application and raw developer
https://www.darktable.org
GNU General Public License v3.0

Feature request: Lossless compression support (i.e. Zstandard) #17086

Open Cadynum opened 4 months ago

Cadynum commented 4 months ago

Perhaps surprisingly, compressing RAW files with Zstandard yields a significant reduction in file size. For example, I compressed one directory of ARW files and ended up at 73.08% of the original size (3.46 GiB => 2.53 GiB).

Darktable could transparently decompress files with compression suffixes, such as img123.arw.zst, allowing users to save space without sacrificing any quality.
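To make the proposal concrete, here is a minimal sketch of suffix-based transparent decompression. This is not darktable code (darktable is written in C and loads raws via rawspeed/libraw); the function name and the suffix table are hypothetical, and Python's stdlib lzma (.xz) stands in for Zstandard, which would need the third-party zstandard package.

```python
import io
import lzma
from pathlib import Path

# Hypothetical suffix -> decompressor table. lzma (.xz) is used here
# because it ships with Python; .zst would need the third-party
# `zstandard` package.
DECOMPRESSORS = {
    ".xz": lzma.decompress,
    # ".zst": zstandard.ZstdDecompressor().decompress,  # third-party
}

def open_raw(path):
    """Return an in-memory buffer of the (possibly compressed) raw file.

    For a plain file the bytes are returned as-is; for a recognized
    compression suffix (e.g. img123.arw.xz) the payload is decompressed
    first, so the caller always sees raw ARW bytes.
    """
    p = Path(path)
    data = p.read_bytes()
    decompress = DECOMPRESSORS.get(p.suffix.lower())
    if decompress is not None:
        data = decompress(data)
    return io.BytesIO(data)
```

The point of the in-memory buffer is that no temporary file ever touches the disk; the trade-off is holding the whole decompressed raw in RAM at once.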

pomoke commented 4 months ago

Which ARW version/variant does your camera produce? The compression ratio may differ. I use an original Sony a7R, which produces only lossy ARWs.

Cadynum commented 4 months ago

The camera is Sony Alpha 7 III. The format:

$ exiftool -FileFormat -SonyModelID -RAWFileType file.ARW 
File Format                     : ARW 2.3.5
Sony Model ID                   : ILCE-7M3
RAW File Type                   : Uncompressed RAW
pomoke commented 4 months ago

Not that good, but it works well for uncompressed RAWs. Tested with zstd level 5; results are as follows: https://gist.github.com/pomoke/624e110042042dcc678a524550fdb1b5

ralfbrown commented 4 months ago

Implementing this will require either decompressing the file to a temporary location and then loading that, or modifying every image loader (rawspeed, libraw, libtiff, ...) to handle decompression while reading the file. The former means an additional round-trip to disk for the decompressed image while opening it.
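The first option described above (decompress to a temporary location, then hand the path to an unmodified loader) can be sketched as follows. Again a hypothetical illustration, with stdlib lzma standing in for Zstandard:

```python
import lzma
import tempfile
from pathlib import Path

def decompress_to_temp(path):
    """Sketch of the temp-file approach: write the decompressed raw to a
    temporary file and return its path, so an unmodified image loader
    (rawspeed, libraw, libtiff, ...) can open it unchanged.

    The cost is the extra disk round-trip mentioned above: the full
    decompressed image is written out and then read back by the loader.
    """
    src = Path(path)
    data = lzma.decompress(src.read_bytes())  # .xz as a stand-in for .zst
    # Keep the inner suffix (img123.arw.xz -> .arw) so loaders that sniff
    # file extensions still recognize the format.
    tmp = tempfile.NamedTemporaryFile(
        suffix=src.with_suffix("").suffix, delete=False
    )
    tmp.write(data)
    tmp.close()
    return tmp.name
```

The caller would be responsible for deleting the temporary file once the loader is done with it.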

Consider using a compressed filesystem instead. I've been using btrfs for about a decade, and it provides transparent compression which can optionally be applied at the level of individual files if write performance in general is a concern. It even uses Zstandard by default in recent versions....

github-actions[bot] commented 2 months ago

This issue has been marked as stale due to inactivity for the last 60 days. It will be automatically closed in 300 days if no update occurs. Please check if the master branch has fixed it and report again or close the issue.

pbo-linaro commented 2 months ago

I observed the same thing on my side, with a Sony A7 III. Raw size goes from 48 MB to 30 MB on average using xz -0, which beat gzip in my tests.
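The kind of comparison described here (gzip vs. xz at preset 0) is easy to reproduce on your own files. A small illustrative script, using repetitive stand-in data since real ratios depend entirely on the raws being compressed:

```python
import gzip
import lzma

def ratios(data):
    """Return compressed/original size ratios for gzip and xz -0."""
    return {
        "gzip": len(gzip.compress(data)) / len(data),
        "xz -0": len(lzma.compress(data, preset=0)) / len(data),
    }

# Repetitive stand-in data; substitute the bytes of an actual raw file
# to measure real-world ratios.
sample = b"\x12\x34\x56\x78" * 50_000
for name, ratio in ratios(sample).items():
    print(f"{name}: {ratio:.2%}")
```

To test against a real file, replace `sample` with `Path("file.ARW").read_bytes()`.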

pbo-linaro commented 2 months ago

Even though btrfs can provide this kind of compression on the fly, I still think there would be value in having it built into darktable. Modifying every library seems like a complicated approach compared to simply decompressing the file in memory and reading the raw from there.

Systems other than Linux, or Linux without btrfs, could benefit from this too, as could backup systems (if they don't compress on the fly).

victoryforce commented 2 months ago

> Systems other than Linux, or Linux without btrfs, could benefit from this too, as could backup systems (if they don't compress on the fly).

Non-Linux systems also have transparent compression on their file systems, which further reduces the priority and relative value of implementing this feature.

On Windows it is very simple: https://www.makeuseof.com/windows-11-file-compression-guide/

Transparent compression is also supported in macOS (in HFS+ and APFS), but I'm not an Apple person, so I won't go into details.

pbo-linaro commented 2 months ago

I understand your point, and that the darktable project seems to prefer solving this with filesystem compression features rather than doing it at the file level.

After using something similar in the past, I found it a bit hard to deal with the difference between the sum of the file sizes and the real size on the filesystem, as well as with performance issues, so I prefer not to use it nowadays. Thus, I'll simply keep compressing/decompressing my raws manually for now :).

You can probably close this issue if this is not a feature that is desired in the project. Thanks!

victoryforce commented 2 months ago

> After using something similar in the past, I found it a bit hard to deal with the difference between the sum of the file sizes and the real size on the filesystem, as well as with performance issues, so I prefer not to use it nowadays. Thus, I'll simply keep compressing/decompressing my raws manually for now :).

Transparent compression at the file system level has many more advantages: you can have thumbnails in the file manager, and all programs can work with your files, not just those that have implemented on-the-fly decompression of files with the .zst extension.

Cadynum commented 2 months ago

I think there is some value in having this feature, for example when paying for backup per GB, or when (as in my case) it would be difficult to change the existing filesystem (ext4) to something more featureful like btrfs. I understand, however, that it's not a priority given the options available.

github-actions[bot] commented 1 week ago

This issue has been marked as stale due to inactivity for the last 60 days. It will be automatically closed in 300 days if no update occurs. Please check if the master branch has fixed it and report again or close the issue.