janet-lang / spork

Various Janet utility modules - the official "Contrib" library.
MIT License
117 stars 35 forks source link

Improve Base64 performance #185

Closed Jakski closed 2 months ago

Jakski commented 2 months ago

Thank you for maintaining Spork. It's very useful in everyday scripting and Base64 implementation allowed me to quickly prototype my ideas.

Upon profiling my code I've discovered that most of time is spent in Base64 functions and I'm not even processing that big files. For example encoding like this:

(import spork/base64)
(let [start (os/clock :monotonic)]
  # Imagine a file with 4000 lines each 80 characters long, so about 316 kilobytes
  (base64/encode (string/repeat @"A" (* 80 4000)))
  (print (- (os/clock :monotonic) start)))

takes 5.7 seconds on my machine and decoding like this:

(import spork/base64)
(let [start (os/clock :monotonic)]
  # Imagine a file with 4000 lines each 80 characters long, so about 316 kilobytes
  (base64/decode (string/repeat @"A" (* 80 4000)))
  (print (- (os/clock :monotonic) start)))

takes 2.7 seconds on my machine.

I tried to profile the current implementation, but finding a single bottleneck is very hard. It seems like the most time is spent in converting arrays between sextets, octets, decimals and binaries.

I've re-implemented Base64 with the following ideas:

As the result decoding dropped to 0.19 second and encoding to 0.07 second.

bakpakin commented 2 months ago

Looks good, I haven't used the base64 module so I didn't realize it was so slow.

Maybe we could just implement it in C for performance?

Jakski commented 2 months ago

Implementing in C sounds good. I'll mark this PR as a draft for now and prepare required changes.

sogaiu commented 2 months ago

It is nice to be able to use the base64 module as pure-janet as it can just be copy-pasted into a project where there is no use of C.

I based an implementation of base32 off of it here.

May be it can be kept somewhere -- as an example may be? Or may be there can be a kind of pure-janet collection of some sorts?

pepe commented 2 months ago

Fyi I have cjanet implementation in https://git.sr.ht/~pepe/gp/tree/master/item/cjanet/codec.janet

Jakski commented 2 months ago

@sogaiu I see your point. I've created a separate package with pure Janet implementation https://github.com/janet-lang/pkgs/pull/66

Latest commits in this PR include optimized C version. I don't have much experience with Janet C API, so any feedback is welcome.

bakpakin commented 2 months ago

This looks pretty good @Jakski , are you going to make any more changes or can I merge it? The code looks pretty clean and the C-API usage looks pretty good. As a convention, for encoding functions I often add an optional buffer to pass in so you can encode directly into an existing buffer.

Jakski commented 2 months ago

I see. I don't plan any more changes, so it's ready for merging.

pepe commented 2 months ago

I guess https://github.com/janet-lang/spork/blob/master/spork/init.janet#L4 needs to be removed. And maybe add the native module at the bottom of the same file?

sogaiu commented 2 months ago

I would have thought so too, along with something around here.

(use spork) seems to work for me oddly enough (with the latest installed), however, doing (use ./spork) from the root directory of a cloned spork repository gives the following here:

(use ./spork)
error: could not find module ./base64:
    spork/base64.jimage
    spork/base64.janet
    spork/base64/init.janet
    spork/base64.so
  in require-1 [boot.janet] on line 3017, column 20
  in import* [boot.janet] on line 3061, column 15
  in thunk [spork/init.janet] (tail call) on line 4, column 1
  in dofile [boot.janet] (tail call) on line 2992, column 7
  in source-loader [boot.janet] on line 3004, column 15
  in require-1 [boot.janet] on line 3028, column 18
  in import* [boot.janet] (tail call) on line 3061, column 15
pepe commented 2 months ago

Are you sure you do not have some remnants of the old version? I will be AFK for at least two days, but I encountered the problem in fresh environment

sogaiu commented 2 months ago

I did jpm uninstall to check, so I think so, but I can check again.

sogaiu commented 2 months ago

Below is a log of what I did.

May be I'm missing something...

$ jpm uninstall
removing /home/user/.local/lib/janet/spork
removing /home/user/.local/bin/janet-format
removing /home/user/.local/bin/janet-netrepl
removing /home/user/.local/lib/janet/spork/json.so
removing /home/user/.local/lib/janet/spork/json.meta.janet
removing /home/user/.local/lib/janet/spork/json.a
removing /home/user/.local/lib/janet/spork/rawterm.so
removing /home/user/.local/lib/janet/spork/rawterm.meta.janet
removing /home/user/.local/lib/janet/spork/rawterm.a
removing /home/user/.local/lib/janet/spork/crc.so
removing /home/user/.local/lib/janet/spork/crc.meta.janet
removing /home/user/.local/lib/janet/spork/crc.a
removing /home/user/.local/lib/janet/spork/utf8.so
removing /home/user/.local/lib/janet/spork/utf8.meta.janet
removing /home/user/.local/lib/janet/spork/utf8.a
removing /home/user/.local/lib/janet/spork/tarray.so
removing /home/user/.local/lib/janet/spork/tarray.meta.janet
removing /home/user/.local/lib/janet/spork/tarray.a
removing /home/user/.local/lib/janet//tarray.h
removing /home/user/.local/lib/janet/spork/zip.so
removing /home/user/.local/lib/janet/spork/zip.meta.janet
removing /home/user/.local/lib/janet/spork/zip.a
removing /home/user/.local/lib/janet/spork/cmath.so
removing /home/user/.local/lib/janet/spork/cmath.meta.janet
removing /home/user/.local/lib/janet/spork/cmath.a
removing /home/user/.local/lib/janet/spork/base64.so
removing /home/user/.local/lib/janet/spork/base64.meta.janet
removing /home/user/.local/lib/janet/spork/base64.a
removing manifest /home/user/.local/lib/janet/.manifests/spork.jdn
Uninstalled.
$ janet
Janet 1.35.2-a5d6b228 linux/x64/gcc - '(doc)' for help
repl:1:> (use spork)
error: could not find module spork:
    /home/user/.local/lib/janet/spork.jimage
    /home/user/.local/lib/janet/spork.janet
    /home/user/.local/lib/janet/spork/init.janet
    /home/user/.local/lib/janet/spork.so
  in require-1 [boot.janet] on line 3017, column 20
  in import* [boot.janet] (tail call) on line 3061, column 15
repl:2:> 
$ jpm install
generating /home/user/.local/lib/janet/.manifests/spork.jdn...
Installed as 'spork'.
copying spork to /home/user/.local/lib/janet...
installing bin/janet-format to /home/user/.local/bin/janet-format
installing bin/janet-netrepl to /home/user/.local/bin/janet-netrepl
copying build/spork/json.so to /home/user/.local/lib/janet/spork...
copying build/spork/json.meta.janet to /home/user/.local/lib/janet/spork...
copying build/spork/json.a to /home/user/.local/lib/janet/spork...
copying build/spork/rawterm.so to /home/user/.local/lib/janet/spork...
copying build/spork/rawterm.meta.janet to /home/user/.local/lib/janet/spork...
copying build/spork/rawterm.a to /home/user/.local/lib/janet/spork...
copying build/spork/crc.so to /home/user/.local/lib/janet/spork...
copying build/spork/crc.meta.janet to /home/user/.local/lib/janet/spork...
copying build/spork/crc.a to /home/user/.local/lib/janet/spork...
copying build/spork/utf8.so to /home/user/.local/lib/janet/spork...
copying build/spork/utf8.meta.janet to /home/user/.local/lib/janet/spork...
copying build/spork/utf8.a to /home/user/.local/lib/janet/spork...
copying build/spork/tarray.so to /home/user/.local/lib/janet/spork...
copying build/spork/tarray.meta.janet to /home/user/.local/lib/janet/spork...
copying build/spork/tarray.a to /home/user/.local/lib/janet/spork...
copying src/tarray.h to /home/user/.local/lib/janet/...
copying build/spork/zip.so to /home/user/.local/lib/janet/spork...
copying build/spork/zip.meta.janet to /home/user/.local/lib/janet/spork...
copying build/spork/zip.a to /home/user/.local/lib/janet/spork...
copying build/spork/cmath.so to /home/user/.local/lib/janet/spork...
copying build/spork/cmath.meta.janet to /home/user/.local/lib/janet/spork...
copying build/spork/cmath.a to /home/user/.local/lib/janet/spork...
copying build/spork/base64.so to /home/user/.local/lib/janet/spork...
copying build/spork/base64.meta.janet to /home/user/.local/lib/janet/spork...
copying build/spork/base64.a to /home/user/.local/lib/janet/spork...
$ janet
Janet 1.35.2-a5d6b228 linux/x64/gcc - '(doc)' for help
repl:1:> (use spork)
@{_ @{:value } argparse/argparse @{:private true} base64/decode @{:private true} base64/encode @{:private true} crc/make-variant @{:private true} crc/named-variant @{:private true} cron/check @{:private true} cron/next-timestamp @{:private true} cron/parse-cron @{:private true} ev-utils/go-nursery @{:private true} ev-utils/join-nursery @{:private true} ev-utils/multithread-service @{:private true} ev-utils/nursery @{:private true} ev-utils/pcall @{:private true} ev-utils/pdag @{:private true} ev-utils/pmap @{:private true} ev-utils/pmap-full @{:private true} ev-utils/pmap-limited @{:private true} ev-utils/spawn-nursery @{:private true} ev-utils/wait-cancel @{:private true} fmt/*user-indent-2-forms* @{:private true} fmt/format @{:private true} fmt/format-file @{:private true} fmt/format-print @{:private true} generators/concat @{:private true} generators/cycle @{:private true} generators/drop @{:private true} generators/drop-until @{:private true} generators/drop-while @{:private true} generators/filter @{:private true} ...}

FWIW, the first thing that comes up for git log for my local spork clone is:

commit d024557895f327d5527b1ecdd063f71372e9a97e (HEAD -> master, origin/master, origin/HEAD)
Merge: e598ef4 dc1cc19
Author: Calvin Rose <calsrose@gmail.com>
Date:   Sun Jun 23 11:58:13 2024 -0700

    Merge pull request #185 from Jakski/faster-base64
sogaiu commented 2 months ago

If I install using (bundle/install ".") and then (use spork), I do get an error:

(use spork)
error: could not find module ./base64:
    /home/user/.local/lib/janet/spork/base64.jimage
    /home/user/.local/lib/janet/spork/base64.janet
    /home/user/.local/lib/janet/spork/base64/init.janet
    /home/user/.local/lib/janet/spork/base64.so
  in require-1 [boot.janet] on line 3017, column 20
  in import* [boot.janet] on line 3061, column 15
  in thunk [/home/user/.local/lib/janet/spork/init.janet] (tail call) on line 4, column 1
  in dofile [boot.janet] (tail call) on line 2992, column 7
  in source-loader [boot.janet] on line 3004, column 15
  in require-1 [boot.janet] on line 3028, column 18
  in import* [boot.janet] (tail call) on line 3061, column 15
Jakski commented 2 months ago

(use spork) seems to work for me oddly enough (with the latest installed), however, doing (use ./spork) from the root directory of a cloned spork repository gives the following here:

I believe this is caused by different layout of installed modules vs local compilation artifacts. When you run jpm build compiled artifacts are placed in local ./build/ directory, but when you do jpm install then *.janet and *.so files are placed in the same directory(~/.asdf/installs/janet/ref-a5d6b2283834422a9fa9e79b5c7ad9b932b52568/lib/janet/spork/ in my case). You can access compiled modules directly with (use ./build/spork/base64).

Yet I don't understand why do we import native and non-native dependencies differently in https://github.com/janet-lang/spork/blob/master/spork/init.janet#L4 It seems that both versions work the same, once module is installed.

If I install using (bundle/install ".") and then (use spork), I do get an error:

I confirm. It happens to me as well with the latest Janet. I've just caught up with discussion from https://github.com/janet-lang/janet/issues/1444 and it seems to me that we will need to support both methods of installation. As far as I understand bundle/ functions are not supposed to read project.janet(correct me on this one). They use https://github.com/janet-lang/spork/blob/d024557895f327d5527b1ecdd063f71372e9a97e/bundle/init.janet instead.

sogaiu commented 2 months ago

Yet I don't understand why do we import native and non-native dependencies differently in https://github.com/janet-lang/spork/blob/master/spork/init.janet#L4 It seems that both versions work the same, once module is installed.

I don't know either.

FWIW, looks like the pattern of doing the imports differently may have started here.

bundle/ functions are not supposed to read project.janet

I think that's correct.

AFAIU, project.janet is a jpm-only thing and bundle/* is an "out-of-the-box" janet thing.

sogaiu commented 2 months ago

Just adding a note here in case someone faces a situation where a pure-Janet piece gets replaced with a native one and they are looking for possible ways forward.


I found that janet-playground had broken as it relied on being able to make-image the result of (require "spork").

I believe this worked in the days when there were no native modules in spork (at least it worked in mid-2022), but at present, it doesn't.

Here is some related detail. There seemed to be an easy work-around in that case -- just (require "spork/fmt") instead.

The reason I bring this up in this PR is that a pure-Janet piece got replaced with a native one and had it been spork/fmt, the specific work-around mentioned above would not have sufficed.

Other work-arounds that could be considered include: