apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0

Running tests uses 50.1GB on Ubuntu #11105

Open samuelcolvin opened 1 week ago

samuelcolvin commented 1 week ago

Describe the bug

I just cloned datafusion and ran cargo t on my Ubuntu desktop to check that things were working properly.

It crashed.

I restarted, and it seems datafusion is using 50.1GB of disk just to run tests.

The examples directory seems to be the biggest culprit.

image

To Reproduce

Just clone and run cargo t on Ubuntu (no idea if this is limited to Linux).

Expected behavior

Hmm, I guess in principle this isn't a showstopper, but it seems somewhat unfortunate.

If there's an easy/low impact way to reduce disk usage, it might be useful.

Additional context

No response

jayzhan211 commented 6 days ago

I got 28GB after cargo clean and cargo t on a MacBook.

Screenshot 2024-06-25 at 5 23 02 PM

jcsherin commented 6 days ago

On Ubuntu,

$ lsb_release -rc
Release:    22.04
Codename:   jammy

After cargo clean and running cargo t:

$ du -h -d2 target
4.0K    target/tmp
6.1G    target/debug/incremental
12M     target/debug/.fingerprint
337M    target/debug/build
16G     target/debug/deps
28G     target/debug/examples
50G     target/debug
50G     target
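
For anyone who wants to repeat this measurement, here is a minimal sketch that ranks the immediate subdirectories of a build directory by size, largest first. `largest_dirs` is a hypothetical helper name (not part of cargo), and the default path `target/debug` assumes the standard dev profile layout shown above.

```shell
#!/usr/bin/env bash
# Sketch: list a directory and its immediate subdirectories by size, largest first.
# `largest_dirs` is a hypothetical helper, not a cargo command.
largest_dirs() {
    local dir="${1:-target/debug}"   # directory to inspect (assumed default)
    local count="${2:-5}"            # how many lines to print
    # du -k reports sizes in KiB; -d1 limits the depth to one level.
    # sort -rn orders numerically, largest first.
    du -k -d1 "$dir" 2>/dev/null | sort -rn | head -n "$count"
}

# Example: largest_dirs target/debug 5
```

The first line of the output is the total for the directory itself, so the biggest subdirectory is on the second line.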

samuelcolvin commented 5 days ago

Not sure what I've done (except run tests more times):

image

Update, now using 103GB.

Seeing this nice error:

...
   Compiling datafusion v39.0.0 (/home/samuel/code/datafusion/datafusion/core)
error: failed to write to `/home/samuel/code/datafusion/target/debug/deps/rmetalvEs2G/lib.rmeta`: No space left on device (os error 28)

error: could not compile `datafusion` (lib) due to 1 previous error

jayzhan211 commented 4 days ago

The more you run, the more artifacts accumulate in /target. cargo clean is all you need.
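
One way to avoid hitting "No space left on device" is to clean only once the directory grows past some limit. This is a rough sketch, not an established workflow: `clean_if_large` is a hypothetical helper, and the 20 GiB default is an arbitrary assumption.

```shell
#!/usr/bin/env bash
# Sketch: run `cargo clean` only when the target directory exceeds a size limit.
# `clean_if_large` is a hypothetical helper; the 20 GiB default is arbitrary.
clean_if_large() {
    local limit_gb="${1:-20}"   # threshold in GiB
    local dir="${2:-target}"    # target directory to check
    [ -d "$dir" ] || return 0   # nothing to do if it does not exist
    local used_kb
    # du -sk prints "<KiB><TAB><path>"; keep only the size.
    used_kb=$(du -sk "$dir" | cut -f1)
    if [ "$used_kb" -gt $((limit_gb * 1024 * 1024)) ]; then
        echo "$dir uses ${used_kb} KiB, running cargo clean"
        cargo clean --target-dir "$dir"
    else
        echo "$dir uses ${used_kb} KiB, under the ${limit_gb} GiB limit"
    fi
}

# Example: clean_if_large 20 target
```

Note that cargo clean removes everything, so the next cargo t rebuilds from scratch.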

alamb commented 3 days ago

Yeah, I think this obscene use of temp space is an artifact of Rust (rather than an artifact of DataFusion specifically).