janmojzis / tinyssh

TinySSH is a small SSH server (less than 100000 words of code)
Creative Commons Zero v1.0 Universal
1.41k stars 75 forks

How to build a smaller tinysshd? #32

Open philanc opened 5 years ago

philanc commented 5 years ago

My objective is to build a tinysshd (static build, with musl libc) as small as possible. Performance is not an issue. The ssh client would be a recent ssh from OpenSSH.

  1. Is it possible to easily build tinysshd with only ed25519 / chacha-poly1305 crypto? (i.e. no nistp256ecdsa, no aes)

  2. I have seen an old HackerNews post mentioning TweetNacl ( https://news.ycombinator.com/item?id=7727738 ) -- Is it still possible to build tinysshd with tweetnacl?

Thanks for Tinysshd!

Phil

janmojzis commented 5 years ago

Hello,

  1. Branch nooldcrypto ( https://github.com/janmojzis/tinyssh/tree/nooldcrypto ): using -Os, the unstripped binary is 1.6x smaller (compiled on my laptop)

  2. TweetNacl was used in an early version of tinyssh, and there are a few problems:

    • TweetNacl doesn't have crypto_stream_chacha20, crypto_hash_sha256
    • TweetNacl has a very, very, very slow crypto_onetimeauth_poly1305

janmojzis commented 5 years ago

Or use branch noprecomputedtables ( https://github.com/janmojzis/tinyssh/tree/noprecomputedtables ): the unstripped binary is ~1.67x smaller (compiled on my laptop), but with a slowdown penalty: crypto_sign_ed25519_sign is ~1.55x slower.

janmojzis commented 5 years ago

Using both optimizations (removing aes/nistp256/hmacsha256 + no precomputed tables), the binary can be ~1.9x smaller.

janmojzis commented 5 years ago

https://github.com/janmojzis/tinyssh/tree/noprecomputedtables merged into main

philanc commented 5 years ago

Thanks a lot!

I built the old master, and the last noprecomputedtables + nooldcrypto, both with -Os, static, with musl libc, stripped (on linux x86-64, built with make-tinysshcc.sh):

old tinysshd: 306,272 bytes
new tinysshd: 141,728 bytes
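
For reference, the two sizes reported above work out to a binary roughly 2.16x smaller. A minimal shell sketch of that arithmetic follows; the `size_ratio` helper is a made-up name for illustration and is not part of tinyssh.

```shell
#!/bin/sh
# Hypothetical helper (not part of tinyssh): print how many times smaller
# the new size is compared with the old one, rounded to two decimals.
size_ratio() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.2f\n", old / new }'
}

# Sizes quoted in this comment (stripped, static, musl libc, -Os).
size_ratio 306272 141728
```

To compare real files instead of quoted numbers, the byte counts can be taken with `wc -c < tinysshd` and passed to the helper.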

On regular servers, people will keep using OpenSSH anyway, but for administering tiny servers or IoT platforms where raw performance is usually not an issue, this last tinysshd, with only one set of modern crypto and no pre-computed tables could be very useful.

I would suggest offering this last configuration at least as a build option in the master branch.

Thanks again!

Avamander commented 5 years ago

It might be worth trying LTO if size matters.

philanc commented 5 years ago

It might be worth trying LTO if size matters.

My previous lowest size was 141,728 stripped, built with make-tinysshcc.sh and conf-cc =

/opt/musl/bin/musl-gcc -static -Os -fomit-frame-pointer -funroll-loops

To build with LTO, I had to change all instances of 'ar' in make-tinysshcc.sh into 'gcc-ar', so that ar knows about some lto plugin(s). Then I built with conf-cc =

/opt/musl/bin/musl-gcc -static -Os -fomit-frame-pointer -flto

Now tinysshd stripped is 129,352 bytes. 12KB less. Not bad!
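
The two changes described above (swapping 'ar' for 'gcc-ar' and adding -flto) could be scripted roughly as follows. This is only a sketch: the `use_gcc_ar` helper is a made-up name, it assumes `ar` always appears as its own whitespace-delimited word in make-tinysshcc.sh, and it assumes conf-cc is a one-line file holding the compiler command. Check the rewritten script by hand before building.

```shell
#!/bin/sh
# Hypothetical helper: rewrite standalone `ar` invocations to `gcc-ar`
# so the archiver can load GCC's LTO plugin. Words that merely contain
# "ar" (e.g. "bar") are left alone. The file is rewritten in place via
# a temporary copy so its permissions are preserved.
use_gcc_ar() {
    script=$1
    tmp="$script.lto.tmp"
    sed -E 's/(^|[[:space:]])ar([[:space:]])/\1gcc-ar\2/g' "$script" > "$tmp" &&
        cat "$tmp" > "$script" &&
        rm -f "$tmp"
}

# Usage, with the paths and flags quoted in this comment:
#   use_gcc_ar make-tinysshcc.sh
#   echo '/opt/musl/bin/musl-gcc -static -Os -fomit-frame-pointer -flto' > conf-cc
#   ...then rerun make-tinysshcc.sh as before.
```

Note that the substitution is not idempotent: running it twice would turn `gcc-ar` at the start of a line into something else only if `ar` reappears as a separate word, so it is safest to apply it once to a fresh copy of the script.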

Thanks for the hint.

janmojzis commented 5 years ago

branch nooldcrypto also merged into main

alexmyczko commented 3 years ago

You can make it even smaller with upx

-rwxr-xr-x 1 root root 122184 Sep 5 2019 tinysshd*
-rwxr-xr-x 1 root root 51368 Mar 22 10:38 tinysshd.upx* (upx on tinysshd)
-rwxr-xr-x 1 root root 46872 Mar 22 10:38 tinysshd.upx.lzma* (upx --lzma on tinysshd)
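
A sketch of how the two compressed copies in the listing above might be produced. The `pack_with_upx` function name is made up for illustration; `-o` and `--lzma` are regular upx options. upx refuses to overwrite an existing output file, so stale copies need to be removed first.

```shell
#!/bin/sh
# Hypothetical helper: write UPX-compressed copies next to the original
# binary, leaving the original untouched.
pack_with_upx() {
    bin=$1
    command -v upx >/dev/null 2>&1 || { echo "upx not found" >&2; return 127; }
    # Default compression method.
    upx -o "$bin.upx" "$bin" &&
        # LZMA method: usually a smaller file.
        upx --lzma -o "$bin.upx.lzma" "$bin"
}

# Usage:
#   pack_with_upx ./tinysshd && ls -l ./tinysshd*
```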

not sure how small you could go with using tinysshd.c and #!/usr/bin/tcc -run

philanc commented 3 years ago

You can make it even smaller with upx

Nice idea, thanks! - It is interesting when storage space is an issue. Of course it doesn't help (and makes things worse) when RAM space is the issue.

not sure how small you could go with using tinysshd.c and #!/usr/bin/tcc -run

An intriguing option! I will check how it works.

alexmyczko commented 3 years ago

UCL has been designed to be simple enough that a decompressor can be implemented in just a few hundred bytes of code. UCL requires no additional memory to be allocated for decompression, a considerable advantage that means that a UPX packed executable usually requires no additional memory.

from: https://en.wikipedia.org/wiki/UPX?wprov=sfti1

philanc commented 3 years ago

UCL requires no additional memory to be allocated for decompression, a considerable advantage that means that a UPX packed executable usually requires no additional memory.

Great! I built upx and tried it on the current tinysshd:

tinysshd 153,960 bytes (static build, with musl libc, stripped)
tinysshd.upx 72,036 bytes (same, compressed with upx)

Yay! Thanks for the suggestion.

Is there a simple way to build without the PQ crypto?