mozilla / sccache

Sccache is a ccache-like tool. It is used as a compiler wrapper and avoids compilation when possible. It can cache results in remote storage, including various cloud storage options, or in local storage.
Apache License 2.0

sccache should help debug distributed compilation issues #346

Open hferreiro opened 5 years ago

hferreiro commented 5 years ago

I followed DistributedQuickstart.md to set up a distributed build, but no jobs are being compiled on my remote server. sccache should provide a verbosity parameter or some log output to help me debug the issue.

aidanhs commented 5 years ago

During development I ran the scheduler and server with the environment variable RUST_LOG=sccache=trace and I would probably do the same for a production setup. At minimum I would strongly recommend info level logging.

Typically you should then at least be able to tell if the local daemon is hitting the scheduler, and if so where it's going wrong.

If there's nothing in the scheduler logs, you need to enable logging on your local daemon - it's a longstanding issue that this isn't trivial with sccache. I suggest stopping any possible running daemon with sccache --stop-server and starting it again in the foreground with RUST_LOG=sccache=trace SCCACHE_START_SERVER=1 SCCACHE_NO_DAEMON=1 sccache. Running a single compilation will then tell you why it failed to submit the job to the scheduler.
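As a rough sketch, the same restart can be scripted so that it behaves the same from any shell (including on Windows, where the `VAR=value command` prefix syntax is not available). The three variables and the `--stop-server` flag are taken from the comment above. The helper names `foreground_env` and `restart_in_foreground` are hypothetical, and the script assumes `sccache` is on the PATH.

```python
import os
import subprocess

# Variables quoted from the comment above; values are passed through unchanged.
FOREGROUND_ENV = {
    "RUST_LOG": "sccache=trace",
    "SCCACHE_START_SERVER": "1",
    "SCCACHE_NO_DAEMON": "1",
}


def foreground_env(base=None):
    """Return a copy of the environment with the debugging variables set."""
    env = dict(os.environ if base is None else base)
    env.update(FOREGROUND_ENV)
    return env


def restart_in_foreground(sccache="sccache"):
    """Stop any running daemon, then run the server in the foreground.

    Blocks until the server exits, so run the compilation from another
    terminal. Returns the exit code of the foreground server process.
    """
    # The stop command may exit non-zero if no daemon is running; ignore that.
    subprocess.run([sccache, "--stop-server"], check=False)
    return subprocess.run([sccache], env=foreground_env(), check=False).returncode
```

Calling `restart_in_foreground()` in one terminal and running a single compilation in another should then print the trace output described above.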

Let me know if this helps and I'll clean it up and add it to the documentation.

hferreiro commented 5 years ago

The problem is in the server/scheduler connection. I guess it may be related to not having an http/s server in front of the scheduler?

TRACE 2018-12-19T09:09:48Z: sccache::dist::http::server: Performing heartbeat
ERROR 2018-12-19T09:09:48Z: sccache::dist::http::server: Failed to send heartbeat to server: Error 404 (Headers={"date": "Wed, 19 Dec 2018 09:09:48 GMT", "server": "Apache/2.4.25 (Debian)", "content-length": "310", "content-type": "text/html; charset=iso-8859-1"}): <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL /api/v1/scheduler/heartbeat_server was not found on this server.</p>
<hr>
<address>Apache/2.4.25 (Debian) Server at 192.168.10.88 Port 80</address>
</body></html>

Is there any way to make it work without any additional server?
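The error line above already shows who answered the heartbeat: the `server` header names Apache, not the sccache scheduler. A minimal sketch for pulling that out of a log line follows. The function name `diagnose_heartbeat` is hypothetical, and the sketch assumes the headers are printed in the JSON-like form seen in this log, which may not hold for other sccache versions.

```python
import json
import re

# Matches the error format seen in the log above (an assumption about the
# format in general, not a documented sccache interface).
HEARTBEAT_ERROR = re.compile(
    r"Failed to send heartbeat to server: Error (\d{3}) \(Headers=(\{.*?\})\)"
)


def diagnose_heartbeat(line):
    """Return (status_code, server_header) for a heartbeat error line.

    Returns None if the line is not a heartbeat error in the expected format.
    """
    match = HEARTBEAT_ERROR.search(line)
    if match is None:
        return None
    status = int(match.group(1))
    headers = json.loads(match.group(2))
    return status, headers.get("server")
```

If the reported server header is a general-purpose web server, the request most likely never reached the scheduler, so the address the build server uses for the scheduler is the first thing to check.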

aidanhs commented 5 years ago

Hmm so:

You can try it out by setting the scheduler public_addr to listen on all interfaces, e.g. 0.0.0.0:10600. Then you can point the server to the public IP of the scheduler, e.g. http://1.2.3.4:10600 (note http rather than https).
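A quick sanity check on the scheduler URL given to the server can catch the two mistakes hinted at in this thread: using https without an HTTPS front end, and omitting the port (the 404 earlier came from Apache on port 80). This is only a sketch: `check_scheduler_url` is a hypothetical helper, and the rules it enforces reflect the trial setup described here, not requirements imposed by sccache itself.

```python
from urllib.parse import urlsplit


def check_scheduler_url(url):
    """Validate a scheduler URL for the plain-HTTP trial setup above.

    Returns (host, port) or raises ValueError with a hint.
    """
    parts = urlsplit(url)
    # Assumption of this sketch: no HTTPS server sits in front of the scheduler.
    if parts.scheme != "http":
        raise ValueError(f"expected an http:// URL, got scheme {parts.scheme!r}")
    # Without an explicit port the request goes to port 80, not the scheduler.
    if parts.hostname is None or parts.port is None:
        raise ValueError("URL needs an explicit host and port, e.g. http://1.2.3.4:10600")
    return parts.hostname, parts.port
```

For the example address from the comment, `check_scheduler_url("http://1.2.3.4:10600")` returns the host and the port 10600, while a URL without a port is rejected.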

hferreiro commented 5 years ago

After fixing that, I'm getting this error:

TRACE sccache::cmdline: parse
DEBUG sccache::config: Attempting to read config file at "/home/hferreiro/.config/sccache/config"
TRACE sccache::commands: Command::Compile { "../../third_party/llvm-build/Release+Asserts/bin/clang", [...]
TRACE sccache::commands: connect_or_start_server(4226)
TRACE sccache::client: connect_to_server(4226)
TRACE sccache::commands: run_server_process
TRACE sccache::cmdline: parse
DEBUG sccache::config: Attempting to read config file at "/home/hferreiro/.config/sccache/config"
TRACE sccache::commands: Command::InternalStartServer
TRACE sccache::client: connect_with_retry(4226)
TRACE sccache::client: connect_to_server(4226)
TRACE sccache::commands: do_compile
TRACE sccache::commands: request_compile: Compile(Compile { exe: [...]
TRACE sccache::client: ServerConnection::request
TRACE sccache::client: ServerConnection::request: sent request
TRACE sccache::client: ServerConnection::read_one_response
TRACE sccache::client: Should read 8 more bytes
TRACE sccache::client: Done reading
DEBUG sccache::commands: Server sent CompileStarted
TRACE sccache::client: ServerConnection::read_one_response
error: failed to execute compile 
caused by: error reading compile response from server
caused by: Failed to read response header
caused by: failed to fill whole buffer

aidanhs commented 5 years ago

That log looks like the local daemon crashed and you'll need to run it with logging per https://github.com/mozilla/sccache/issues/346#issuecomment-448395537, e.g. sccache --stop-server && RUST_LOG=sccache=trace SCCACHE_START_SERVER=1 SCCACHE_NO_DAEMON=1 sccache.

Then run the compile again and see what happens!

hferreiro commented 5 years ago

I've got this error message:

thread '<unnamed>' panicked at 'failed to create toolchain', src/compiler/c.rs:387:13

icecc-create-env is available from the icecream Fedora package, installed in /usr/bin.

aidanhs commented 5 years ago

Ah! I think a change has been merged into sccache master to use a custom toolchain packager for C, so you don't need icecc-create-env.

rahulbansal16 commented 3 years ago

@aidanhs I am trying to start the server in the foreground for the Windows client. I have set the environment variables RUST_LOG=sccache=trace, SCCACHE_START_SERVER=1 and SCCACHE_NO_DAEMON=1, as suggested earlier:

If there's nothing in the scheduler logs, you need to enable logging on your local daemon - it's a longstanding issue that this isn't trivial with sccache. I suggest stopping any possible running daemon with sccache --stop-server and starting it again in the foreground with RUST_LOG=sccache=trace SCCACHE_START_SERVER=1 SCCACHE_NO_DAEMON=1 sccache. Running a single compilation will then tell you why it failed to submit the job to the scheduler.

Running the command ./target/debug/sccache.exe throws the error

sccache: No command specified sccache 0.2.16-alpha.0

Running the command ./target/debug/sccache.exe --dist-status provides information about the number of CPUs connected,

but compiling via cargo build still happens locally.

How can I set the client daemon to the foreground to debug the issue?