busyloop / rucksack

Attach static files to your compiled crystal binary and access them at runtime.
MIT License

Rucksack is sometimes very slow #8

Open threez opened 1 year ago

threez commented 1 year ago

Problem

I want to use rucksack to embed static files with the binary. I use no other external binary but rucksack.

The directory only includes one file:

$ ls -alh ./src/static
total 2.0M
drwxr-xr-x 2 vil vil 4.0K Apr 10 10:54 .
drwxr-xr-x 6 vil vil 4.0K Apr 10 10:24 ..
-rw-r--r-- 1 vil vil 2.0M Apr 10 10:55 iconoir.css

Without rucksack, the application starts quickly using crystal run ... (1 sec). With rucksack, the compile-time section took between 1 and 28 seconds across five runs:

start                              end                                duration
Mon 10 Apr 2023 11:40:08 AM CEST   Mon 10 Apr 2023 11:40:36 AM CEST   28 sec
Mon 10 Apr 2023 11:41:42 AM CEST   Mon 10 Apr 2023 11:41:57 AM CEST   15 sec
Mon 10 Apr 2023 11:46:06 AM CEST   Mon 10 Apr 2023 11:46:07 AM CEST   1 sec
Mon 10 Apr 2023 11:46:32 AM CEST   Mon 10 Apr 2023 11:46:33 AM CEST   1 sec
Mon 10 Apr 2023 11:46:49 AM CEST   Mon 10 Apr 2023 11:46:51 AM CEST   2 sec

Way I measured:

startts = {{ `date`.stringify }}
{% for name in `find ./src/static -type f`.split('\n') %}
  rucksack({{name}})
{% end %}
endts = {{ `date`.stringify }}
puts startts, endts

Manually doing the find:

$ time find ./src/static -type f  
./src/static/iconoir.css
find ./src/static -type f  0.00s user 0.00s system 88% cpu 0.001 total

Environment

Crystal 1.7.3 [d61a01e18] (2023-03-07)

LLVM: 13.0.1
Default target: x86_64-unknown-linux-gnu

Linux XYZ 5.15.90.1-microsoft-standard-WSL2 #1 SMP Fri Jan 27 02:56:13 UTC 2023 x86_64 GNU/Linux

threez commented 1 year ago

Sometimes I also see:

Error: while requiring "./static"
Error: Unable to get file info: '.../lib/rucksack/src/.rucksack_packer.cr': No such file or directory

Maybe this is related?

threez commented 1 year ago

My workaround for the moment:

require "http/server/handler"
require "mime"

# enable rucksack only on release builds
{% if flag?(:release) %}
  require "rucksack"

  {% for name in `find ./src/static -type f`.split('\n') %}
    rucksack({{name}})
  {% end %}
{% end %}

class StaticHandler
  include HTTP::Handler

  def call(context)
    if context.request.path.starts_with?("/static")
      path = "./src#{context.request.path}"
      if type = MIME.from_filename?(path)
        context.response.content_type = type
      end

      {% if flag?(:release) %}
        begin
          rucksack(path).read(context.response.output)
          return
        rescue Rucksack::FileNotFound
        end
      {% else %}
        begin
          File.open(path) do |file|
            IO.copy(file, context.response.output)
            return
          end
        rescue File::NotFoundError
        end
      {% end %}
    end

    call_next(context)
  end
end

m-o-e commented 1 year ago

Hmm.

Is it consistently slow or "sometimes slow, sometimes fast"?

And is the final binary (with rucksack attached) also slow, or does the slowness only affect crystal run?

The error seems to suggest broken read-after-write semantics.

At compile-time .rucksack_packer.cr is written, executed, then deleted.

According to your error message the file apparently did not immediately become visible for reading after it was generated.

Are you perhaps compiling on a network filesystem (samba / NFS)? Otherwise I'm afraid this is likely something WSL related (could also be that shelling out, which rucksack does at compile time, is intermittently slow / unreliable on there?).

I don't have Windows / WSL to test, so I can only defer to the note in the README here ("Windows is not supported"). However, if your workaround sidesteps the problem for you then that looks like a good solution. 🤞
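
One way to check the read-after-write hypothesis is a small probe that writes a file and reads it straight back, many times over. The following is a minimal sketch and not part of rucksack; the method name `read_after_write_failures` and the probe file naming are made up for illustration.

```crystal
# Hypothetical diagnostic (not part of rucksack): counts how often a freshly
# written file cannot be read back immediately from the given directory.
def read_after_write_failures(dir : String, attempts : Int32 = 1000) : Int32
  failures = 0
  attempts.times do |i|
    # Unique name per process and iteration so parallel probes do not collide.
    path = File.join(dir, ".raw_probe_#{Process.pid}_#{i}")
    File.write(path, "probe")
    begin
      failures += 1 unless File.read(path) == "probe"
    rescue File::NotFoundError
      # The file was written but is not yet visible for reading.
      failures += 1
    ensure
      File.delete(path) if File.exists?(path)
    end
  end
  failures
end

puts read_after_write_failures(Dir.current)
```

Running this from the project directory should print 0 on a filesystem with ordinary local semantics. A non-zero count would point at the filesystem rather than at rucksack, although a zero count does not rule out a race between concurrent compiles.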

threez commented 1 year ago

It is not deterministic, so yes, sometimes slow and sometimes "fast". However, it is not really fast. The final binary is fast. Only the compile time is slow.

Interesting, regarding the read-after-write. I'm using Crystalline; could this maybe interfere with the creation of the file? So maybe it is a concurrent-use issue? As I'm coding while waiting for the build, I "feel" it is related to my work. Does rucksack "wait" somehow, and could another compile break it?

I'm not working on samba/NFS but on the wsl disk:

none on /mnt/wsl type tmpfs (rw,relatime)
/dev/sdc on / type ext4 (rw,relatime,discard,errors=remount-ro,data=ordered)
none on /mnt/wslg type tmpfs (rw,relatime)
/dev/sdc on /mnt/wslg/distro type ext4 (ro,relatime,discard,errors=remount-ro,data=ordered)
drvfs on /mnt/c type 9p (rw,noatime,dirsync,aname=drvfs;path=C:\;uid=1000;gid=1000;symlinkroot=/mnt/,mmap,access=client,msize=262144,trans=virtio)

So it should not be the issue. I also moved the whole project to /tmp, but the section with rucksack still takes about 2 seconds. I think that is rather long...

m-o-e commented 1 year ago

Crystalline might indeed be related (I don't know how it works under the hood, but if it starts multiple compiles concurrently then that could mess with things).
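
If concurrent compiles are the cause, a fixed helper path such as .rucksack_packer.cr would be shared between them, so one compile could delete the file while another still needs it. Below is a sketch of one possible mitigation that assumes nothing about rucksack's actual internals: write the helper source to a unique path per invocation and always clean it up. `with_unique_helper` is a hypothetical name.

```crystal
# Hypothetical sketch (not rucksack's implementation): give every invocation
# its own helper file so that concurrent compiles cannot delete each other's.
def with_unique_helper(source_code : String, &block)
  # File.tempname returns a fresh, unused path in the system temp directory.
  path = File.tempname("rucksack_packer", ".cr")
  File.write(path, source_code)
  begin
    yield path
  ensure
    # Clean up even if the block raises.
    File.delete(path) if File.exists?(path)
  end
end

with_unique_helper("puts \"packing\"") do |path|
  puts "helper written to #{path}"
end
```

The trade-off is that the helper no longer lives next to the library source, so any relative paths inside it would need to be resolved differently.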

2 seconds sounds about right; the crystal run just takes a moment. But I agree that for interactive compiling that's not great, so your workaround makes sense in any case.

This part of rucksack is def worth optimising at some point, but I haven't gotten around to it yet. As an alternative to disabling rucksack completely during dev-builds, it could also compile the rucksack_packer binary only once and then re-use it.

(I think back in the day I was reluctant to pollute the filesystem with a temp-binary but in hindsight it seems like a low hanging fruit worth plucking...)
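
The compile-once idea could look roughly like the sketch below. It is an illustration only, not how rucksack works today: `stale?` and `cached_build` are hypothetical helpers, and the cache location is left to the caller. The binary is rebuilt only when it is missing or older than its source.

```crystal
# Hypothetical sketch of "compile once, then re-use" (not rucksack's code).

# True when the binary is missing or older than the source it was built from.
def stale?(source : String, binary : String) : Bool
  return true unless File.exists?(binary)
  File.info(binary).modification_time < File.info(source).modification_time
end

# Builds `source` into `binary` with `crystal build` only when needed,
# then returns the path of the (possibly cached) binary.
def cached_build(source : String, binary : String) : String
  if stale?(source, binary)
    Dir.mkdir_p(File.dirname(binary))
    status = Process.run("crystal", ["build", source, "-o", binary],
      output: Process::Redirect::Inherit,
      error: Process::Redirect::Inherit)
    raise "failed to build #{source}" unless status.success?
  end
  binary
end

# Example call (paths are placeholders):
#   packer = cached_build("packer.cr", File.join(Dir.tempdir, "packer_cache", "packer"))
#   Process.run(packer, ["some/file"])
```

With this approach only the first compile pays for building the helper; later compiles just execute the cached binary, at the cost of leaving a file behind on disk.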