arduino / arduino-cli

Arduino command line tool
https://arduino.github.io/arduino-cli/latest/
GNU General Public License v3.0
4.28k stars 371 forks source link

build_cache.path not honored for "sketches" #2668

Open egnor opened 1 month ago

egnor commented 1 month ago

Describe the problem

Per the documentation the config key build_cache.path (or environment $ARDUINO_BUILD_CACHE_PATH, etc) should cause temp files to go in the designated directory instead of $TMP/arduino which is actually global on the system.

However, while setting that config does redirect the cores/ directory, it does NOT redirect sketches/

To reproduce

Expected behavior

INSTEAD, I see something like this

% ls -l /tmp/test_dir
total 4
drwxr-xr-x 3 egnor egnor 4096 Jul 16 19:58 cores/

% ls -l /tmp/arduino
total 4
drwxr-xr-x 3 egnor egnor 4096 Jul 16 19:58 sketches/

So, cores/ got moved but sketches/ did not.

Arduino CLI version

arduino-cli Version: 1.0.0 Commit: 05c9852a Date: 2024-06-12T14:13:32Z

Operating system

Linux

Operating system version

Ubuntu 24.04 LTS, kernel 6.8.0-36-generic

Additional context

Code that might be relevant https://github.com/arduino/arduino-cli/blob/b4f8849cae83199d05e14aa15a2fc050fd5aed1b/commands/service_compile.go#L434 https://github.com/arduino/arduino-cli/blob/b4f8849cae83199d05e14aa15a2fc050fd5aed1b/internal/arduino/sketch/sketch.go#L285

I think the latter one is probably the important one (the first one is just for purging, but it should be made consistent). Both should really be using the common build cache dir.

Issue checklist

alessio-perugini commented 1 month ago

@egnor this is intended. The --build-cache-path will be responsible for writing only the compiled cores, that different sketch compilations can re-use. The idea under that flag is to store only what can be reused for compilations that span from different sketches. The --build-path will write the content of the sketch build (the one usually found at /tmp/arduino/sketches/hash/*). When you that flag the cli will use the provided path to start writing all the build-related files. However, it doesn't put the files under a parent folder (sketches/hash). The logic behind this is that you may want to control a specific compilation where to store the build-related files.

Do you think it makes sense? Do you think we should explain it better in the doc? We're happy to hear your feedback on this :smile:

egnor commented 1 month ago

@alessio-perugini hmm, the documentation and/or terminology could probably be improved? These are the various temp directories I can find

flag setting default contents
--build-cache-path build_cache.path $TMP/arduino core subdir with object files (etc) for cores
--build-path ??? $TMP/arduino/sketches/[hash] object files (etc) for sketch & libraries
- directories.downloads ~/.arduino15/staging staging directory for downloaded files
- - {directories.data}/tmp temp unpack of packages, temp installs of packages referenced by profiles
- - $TMP/package-xxx temp unpack of downloaded packages
- - $TMP/xxx-library-index-download temp download of library index
- - $TMP/xxx temp git clone of packages

For me, something like this would be way less surprising:

flag setting default contents
directories.temp system temp + /arduino root of sundry temp dirs
--build-cache-path build_cache.path user cache dir + /arduino ALL cacheable build artifacts (in cores/ and sketches/)
--sketch-cache-path or --build-path sketch.build_cache {build_cache.path}/sketches/[hash] override sketch & libraries artifacts
- - {directories.temp}/whatever other misc temp things

...and in any case it should be clarified; the "Configuration" page just lists build_cache.path as "the path to the build cache" with no further elaboration.

Use cases:

cmaglie commented 1 month ago

The temp folder inside .arduino15/tmp has a reason: it allows the unpacking of zip archives on the same filesystem where .arduino15 lives so they can be moved to the destination folder (usually under .arduino15/packages/...), the move operation has many advantages, is atomic and very fast, almost instantaneous. If we use the global /tmp the move is not always possible since /tmp may belong to a different filesystem: in that case, we must fall back on copying the content of the extracted archive instead of moving it, with a big performance hit. I would not change this for the above reasons. ❌

The other sundry temp files in /tmp/..., outside of /tmp/arduino/..., should all be temporary and removed just after the corresponding operation of the Arduino CLI finishes. If this is not the case it's a bug and we should fix it (and looking at your table I think you have plenty of cases to report where this happens :-)). BTW I agree that they could be moved inside /tmp/arduino for consistency. ✅

I like also the proposed changes for --build-cache-path and --build-path. ✅

The setting var sketch.build_cache IMHO doesn't make much sense because it's sketch-specific, it's kind of strange to have a global setting for that, so I will not add this setting to the configuration. ❌

And now thinking about this, I'm wondering if we should ignore the build cache altogether when the user specifies a --build-path, and let the compiler rebuild the core inside the build path instead.

egnor commented 1 month ago

Yeah, on reflection I feel like there's a difference between "true temporary" and "cache" files?

True temporary files can be made with TempDir or whatever the right method is. I'd actually suggest they NOT be put in /tmp/arduino because there can be permissions issues with multiple users? And yeah if sometimes they get created in a different place to ensure moveability that makes sense. I don't think these need to be configurable, though they should use a standard function that obeys $TMP_DIR or equivalent for the platform. I think most of this is fine as is.

Cache files are the issue here. They shouldn't be in /tmp/ at all, but in ~/.cache/ (Linux), ~/Library/Caches/ (Mac), or some sort of AppData directory (Windows). Probably there's some golang utility to find this standard location? And it should for sure be customizable, because there are use cases for wanting to isolate the. cache, put it somewhere specific, erase it, etc..

I hear what you're saying about how it doesn't make sense to have an environment variable/config key for a sketch-specific cache path (as opposed to the overall cache root). Honestly I think just relocating the overall cache root seems fine. I can't think of a use case where you want to use the global core cache but move around the sketch build cache (though I'm sure someone could think of one)?

cmaglie commented 1 month ago

Cache files are the issue here. They shouldn't be in /tmp/ at all, but in ~/.cache/ (Linux), ~/Library/Caches/ (Mac), or some sort of AppData directory (Windows). Probably there's some golang utility to find this standard location? And it should for sure be customizable, because there are use cases for wanting to isolate the. cache, put it somewhere specific, erase it, etc..

We may use os.UserCacheDir for that, it seems to fit the purpose.

I hear what you're saying about how it doesn't make sense to have an environment variable/config key for a sketch-specific cache path (as opposed to the overall cache root). Honestly I think just relocating the overall cache root seems fine.

Ok, I've drafted a proposal here #2673. Let's continue the discussion there so we have something to work on.