containers / composefs

a file system for mounting container images
GNU General Public License v2.0
421 stars 29 forks source link

Added multithreading support for mkcomposefs #269

Closed divineaugustine closed 5 months ago

divineaugustine commented 5 months ago

This resolves #249 and results in 10x improvement for digest calculation and file copy.

Please refer the below document for details about the issue and the change. I would like to know your thoughts. threading.md

divineaugustine commented 5 months ago

Can you please run clang-format? Basically let's get past the superficial bits before we can dig into the threading.

Thank you for checking. I have run clang-format on my changes via vscode formatting. Is that what you meant? FYI: I did rebase and force push to fix DCO error.

divineaugustine commented 5 months ago

Sorry for the inconvenience one header file was missed out from formatting. I have corrected that as well. Hopefully this solves all the formatting issues from my changeset.

alexlarsson commented 5 months ago

I'll have a look at the code, but can you please rebase this and squash the fixups so the end result is a minimal set of independent changes. Say one for the liibrary changes, and one of the mkcomposefs use of it.

alexlarsson commented 5 months ago

So, lots of comments. Make sure you read them all first, because some will make the other ones perhaps not needed. In particular, the propsed change to lcfs_node_set_from_content() will make most of the library API changes unnecessary.

divineaugustine commented 5 months ago

So, lots of comments. Make sure you read them all first, because some will make the other ones perhaps not needed. In particular, the propsed change to lcfs_node_set_from_content() will make most of the library API changes unnecessary.

Thank you for the comments. I have addressed those and also squashed the commits to only 3 separate ones as you suggested. As i don´t know who should mark the comments as resolved, i just added reply to those.

divineaugustine commented 5 months ago

Various nitpicks, but overall this looks good to me.

Thank you for reviewing it. I have addressed those and pushed the changes.

alexlarsson commented 5 months ago

Some more minor feedback, but I would like @cgwalters or @giuseppe to also review this.

alexlarsson commented 5 months ago

Also, if you're able to talk about it, I'd love to hear what you're using composefs for. Its good to know what your users are doing.

divineaugustine commented 5 months ago

Also, if you're able to talk about it, I'd love to hear what you're using composefs for. Its good to know what your users are doing.

Can you give some details @https://github.com/r0l1 ? He is the right person to comment on this.

divineaugustine commented 5 months ago

left a comment, could you please sign the commits with your real name?

My real name is Divin and i have signed off the commits with it. Did you mean including initials Divin OA or even initials expanded Divin Ookken Athappan?

giuseppe commented 5 months ago

My real name is Divin and i have signed off the commits with it. Did you mean including initials Divin OA or even initials expanded Divin Ookken Athappan?

yes please use the expanded version

cgwalters commented 5 months ago

Yes, separate PRs for things like this that are logically separate please.

Thank you for the links. Reading through https://lwn.net/Articles/846403/ i think it is safer to rollback copy_file_range to older read/write or perhaps address this as a separate PR as this does not have a direct dependency to the original multi-threading change. What do you say? I am very new to linux and i was looking for better apis to copy a file without user mode buffer.

divineaugustine commented 5 months ago

Yes, separate PRs for things like this that are logically separate please.

I will create a separate PR for it. Can i also squash up the current commits in this PR to just 2 ? One for library and other for mkcomposefs. It was done before but then as part of review comments another set of 4 were added again.

cgwalters commented 5 months ago

Yes, I think one or two commits here is best.

cgwalters commented 5 months ago

I'm assuming your goal is faster mkcomposefs overall (on a server side build system, right?). Using multiple threads is one way to do that. However, another way is caching - ostree does this thing where the build system can retain a cache of (device, inode number) :arrow_right: checksum; in theory we could offer the same.

The downside is of course that if something happens to mutate a file in the cache, you get incorrect checksums. But this is pretty easily mitigated, and actually a great thing with composefs is such a failure should be detected at runtime.

(These are complementary approaches to be clear; threading also makes sense)

divineaugustine commented 5 months ago

Yes, I think one or two commits here is best.

Squashed the commits to just 2 (one for library and one for mkcomposefs) and siged it off with my full name. Added an issue #274 for copy_file_range

r0l1 commented 5 months ago

@alexlarsson we are using composefs for our custom OS and as docker container management layer. We have a small go initramfs including composefs and created a simple A/B OS which powers our devices at Wahtari and nLine. The devices are interlinked with a simple mesh like end-to-end encrypted network and this enables us to push updates (composefs store diff). I have some ideas how to improve the composefs store management and would like to discuss those ideas soon.

I really appreciate your work and follow your ideas since Glick2.

@cgwalters yes, this PR is for an internal speedup of the OS build process. Thanks for pointing out the cache idea.

alexlarsson commented 5 months ago

@r0l1 That sounds cool! Nice that it works for you.

divineaugustine commented 5 months ago

I'm assuming your goal is faster mkcomposefs overall (on a server side build system, right?). Using multiple threads is one way to do that. However, another way is caching - ostree does this thing where the build system can retain a cache of (device, inode number) ➡️ checksum; in theory we could offer the same.

The downside is of course that if something happens to mutate a file in the cache, you get incorrect checksums. But this is pretty easily mitigated, and actually a great thing with composefs is such a failure should be detected at runtime.

(These are complementary approaches to be clear; threading also makes sense)

Thank you for these alternative-approaches. My objective was to make mkcomposefs run fast in any machine as part of our build process. It was simple and straight forward for me to add threads in mkcomposefs which does not require any facilitation from the end user. I am new to linux tooling and I would require some consultations here to fully understand these alternative approaches.

alexlarsson commented 5 months ago

I'm assuming your goal is faster mkcomposefs overall (on a server side build system, right?). Using multiple threads is one way to do that. However, another way is caching - ostree does this thing where the build system can retain a cache of (device, inode number) ➡️ checksum; in theory we could offer the same.

This only works though if you image compose produces hardlinked copies of files you previously committed (otherwise the device/inode of the new files will not match something old in the cache). This happens regularly in ostree because of how it is set up, but that isn't necessarily always true, and it causes some level of extra pain to enforce when you build using ostree. (With things like rofiles-fuse, etc.)

Not saying it is bad, but its not always applicable.

But anyway, if you do use a setup like that where you end up with hardlinked files, then the best way to store and quickly access the cached digest is probably to just enable fs-verity on these files. In other words, we should probably have lcfs_node_set_from_content() start by asking the kernel for the fs-verity digest in case it is set, rather than computing it ourselves.

cgwalters commented 5 months ago

In other words, we should probably have lcfs_node_set_from_content() start by asking the kernel for the fs-verity digest in case it is set, rather than computing it ourselves.

Yes agreed, this is an obvious step.

alexlarsson commented 5 months ago

I think this looks good enough now. I looked through it again and did some local tests, seems to work well, and the public library APIs are sane.