Agoric / agoric-sdk

monorepo for the Agoric Javascript smart contract platform
Apache License 2.0
326 stars 206 forks source link

arguing the security properties of xsnap (XS, SES, JS) #2224

Closed dckc closed 1 year ago

dckc commented 3 years ago

@erights , you asked me how I'd go about this... the question has been rattling around in my head and it's starting to take shape... let me know if I should move this under the Agoric org. during this early brainstorming stage. cc @warner @dtribble @kriskowal

dump of open tabs:

dckc commented 3 years ago

Hardware support? MPX in Intel Skylake

From What is memory safety? - The PL Enthusiast:

Pointers as capabilities

The definition I prefer, which I’ll now present, is inspired by Santosh Nagarakatte‘s work (done with several collaborators) on SoftBound, an approach for enforcing memory safety in C programs (which has inspired Intel hardware support, coming soon).

Intel MPX - Wikipedia citing "Intel Software Development Emulator". Intel. 2012-06-15:

Intel MPX was introduced as part of the Skylake microarchitecture.

grep -i mpx /proc/cpuinfo shows mpx, suggesting it's on my machine.

Intel® Memory Protection Extensions (Intel® MPX) support in the GNU toolchain says it's deprecated, and Wikipedia goes on to say "In practice, there have been too many flaws discovered in the design for it to be useful, and support has been deprecated or removed from most compilers and operating systems." But it still looks interesting, to me. Wikipedia doesn't note the connection to capabilities. I wonder how similar it is to CHERI.

dckc commented 3 years ago

Executable: 1.5M

2.7M for debug

agoric-sdk/packages/xsnap$ ls -lsh build/bin/lin/*/xsnap
2.7M -rwxrwxr-x 1 connolly connolly 2.7M Jan 23 20:03 build/bin/lin/debug/xsnap
1.5M -rwxrwxr-x 1 connolly connolly 1.5M Jan 23 20:03 build/bin/lin/release/xsnap

Compiled C: 2,084KB

agoric-sdk/packages/xsnap/build/tmp/lin/release/xsnap$ ls -sh *.o
8.0K xsAll.o         96K xsDataView.o    36K xsJSON.o       36K xsObject.o      28K xsScope.o
 60K xsAPI.o        120K xsDate.o        60K xsLexical.o    12K xsPlatforms.o   16K xsScript.o
8.0K xsArguments.o  8.0K xsDebug.o       48K xsMapSet.o    4.0K xsProfile.o     68K xsSnapshot.o
 80K xsArray.o       16K xsDefaults.o    20K xsMarshall.o   32K xsPromise.o    8.0K xsSourceMap.o
 60K xsAtomics.o     44K xsdtoa.o        24K xsMath.o       12K xsProperty.o    52K xsString.o
 64K xsBigInt.o      12K xsError.o       68K xsMemory.o     36K xsProxy.o       16K xsSymbol.o
4.0K xsBoolean.o     24K xsFunction.o    36K xsModule.o     32K xsRegExp.o     108K xsSyntaxical.o
 96K xsCode.o        24K xsGenerator.o   48K xsnap.o       292K xsre.o          60K xsTree.o
 48K xsCommon.o      24K xsGlobal.o      16K xsNumber.o     88K xsRun.o         32K xsType.o

agoric-sdk/packages/xsnap/build/tmp/lin/release/xsnap$ ls -s *.o >,siz
agoric-sdk/packages/xsnap/build/tmp/lin/release/xsnap$ python

Python 2.7.17 (default, Sep 30 2020, 13:38:04) 
>>> lines = open(',siz').readlines()
>>> sum([int(line.strip().split(None, 1)[0]) for line in lines if line.strip()])
2084

Dependencies: libc, libpthread only

The debug build also has some socket stuff but we no longer use that in production: https://github.com/Agoric/agoric-sdk/issues/2216

agoric-sdk/packages/xsnap$ ldd build/bin/lin/release/xsnap 
    linux-vdso.so.1 (0x00007ffe61391000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f57743c6000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f57741a7000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f5773db6000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f5774ae5000)

agoric-sdk/packages/xsnap$ nm build/bin/lin/release/xsnap | grep 'U '
                 U acos@@GLIBC_2.2.5
                 U acosh@@GLIBC_2.2.5
                 U asin@@GLIBC_2.2.5
                 U asinh@@GLIBC_2.2.5
                 U atan2@@GLIBC_2.2.5
                 U atan@@GLIBC_2.2.5
                 U atanh@@GLIBC_2.2.5
                 U calloc@@GLIBC_2.2.5
                 U cbrt@@GLIBC_2.2.5
                 U cos@@GLIBC_2.2.5
                 U cosh@@GLIBC_2.2.5
                 U __ctype_tolower_loc@@GLIBC_2.3
                 U __errno_location@@GLIBC_2.2.5
                 U exit@@GLIBC_2.2.5
                 U exp@@GLIBC_2.2.5
                 U expm1@@GLIBC_2.2.5
                 U fclose@@GLIBC_2.2.5
                 U fdopen@@GLIBC_2.2.5
                 U feof@@GLIBC_2.2.5
                 U fflush@@GLIBC_2.2.5
                 U fgetc@@GLIBC_2.2.5
                 U floor@@GLIBC_2.2.5
                 U fmod@@GLIBC_2.2.5
                 U fopen@@GLIBC_2.2.5
                 U __fprintf_chk@@GLIBC_2.3.4
                 U fputc@@GLIBC_2.2.5
                 U fputs@@GLIBC_2.2.5
                 U fread@@GLIBC_2.2.5
                 U free@@GLIBC_2.2.5
                 U fwrite@@GLIBC_2.2.5
                 U gettimeofday@@GLIBC_2.2.5
                 U hypot@@GLIBC_2.2.5
                 U __isoc99_fscanf@@GLIBC_2.7
                 U __libc_start_main@@GLIBC_2.2.5
                 U localtime@@GLIBC_2.2.5
                 U log10@@GLIBC_2.2.5
                 U log1p@@GLIBC_2.2.5
                 U log2@@GLIBC_2.2.5
                 U log@@GLIBC_2.2.5
                 U __longjmp_chk@@GLIBC_2.11
                 U malloc@@GLIBC_2.2.5
                 U memcmp@@GLIBC_2.2.5
                 U __memcpy_chk@@GLIBC_2.3.4
                 U memcpy@@GLIBC_2.14
                 U memmove@@GLIBC_2.2.5
                 U memset@@GLIBC_2.2.5
                 U mktime@@GLIBC_2.2.5
                 U nearbyint@@GLIBC_2.2.5
                 U pow@@GLIBC_2.2.5
                 U __printf_chk@@GLIBC_2.3.4
                 U pthread_cond_destroy@@GLIBC_2.3.2
                 U pthread_cond_init@@GLIBC_2.3.2
                 U pthread_cond_signal@@GLIBC_2.3.2
                 U pthread_cond_timedwait@@GLIBC_2.3.2
                 U pthread_cond_wait@@GLIBC_2.3.2
                 U pthread_mutex_destroy@@GLIBC_2.2.5
                 U pthread_mutex_init@@GLIBC_2.2.5
                 U pthread_mutex_lock@@GLIBC_2.2.5
                 U pthread_mutex_unlock@@GLIBC_2.2.5
                 U pthread_self@@GLIBC_2.2.5
                 U puts@@GLIBC_2.2.5
                 U qsort@@GLIBC_2.2.5
                 U rand@@GLIBC_2.2.5
                 U realloc@@GLIBC_2.2.5
                 U __realpath_chk@@GLIBC_2.4
                 U round@@GLIBC_2.2.5
                 U _setjmp@@GLIBC_2.2.5
                 U sin@@GLIBC_2.2.5
                 U sinh@@GLIBC_2.2.5
                 U __snprintf_chk@@GLIBC_2.3.4
                 U __sprintf_chk@@GLIBC_2.3.4
                 U sqrt@@GLIBC_2.2.5
                 U __stack_chk_fail@@GLIBC_2.4
                 U __stpcpy_chk@@GLIBC_2.3.4
                 U strcat@@GLIBC_2.2.5
                 U strchr@@GLIBC_2.2.5
                 U strcmp@@GLIBC_2.2.5
                 U __strcpy_chk@@GLIBC_2.3.4
                 U strcpy@@GLIBC_2.2.5
                 U strerror@@GLIBC_2.2.5
                 U strlen@@GLIBC_2.2.5
                 U strncat@@GLIBC_2.2.5
                 U strncmp@@GLIBC_2.2.5
                 U strncpy@@GLIBC_2.2.5
                 U strrchr@@GLIBC_2.2.5
                 U strstr@@GLIBC_2.2.5
                 U strtol@@GLIBC_2.2.5
                 U tan@@GLIBC_2.2.5
                 U tanh@@GLIBC_2.2.5
                 U __vfprintf_chk@@GLIBC_2.3.4
                 U __vsnprintf_chk@@GLIBC_2.3.4

strcpy

I don't see any guarantee that strcpy in fxBuildModuleMap is memory-safe. TODO: check whether if (preparation && the->archive) is always false in xsnap.

erights commented 3 years ago

Good! Yes, please move to Agoric.

On Wed, Jan 20, 2021 at 7:14 AM Dan Connolly notifications@github.com wrote:

@erights https://github.com/erights , you asked me how I'd go about this... the question has been rattling around in my head and it's starting to take shape... let me know if I should move this under the Agoric org. during this early brainstorming stage. ...

-- Cheers, --MarkM

dckc commented 3 years ago

for reference, earlier discussions:

dckc commented 3 years ago

splint looks somewhat promising... with a few annotations, it might be able to make a pretty good memory-safety argument.

The defaults seem incompatible with the current XS idioms, but perhaps that could be addressed easily:

moddable/xs/sources]$ splint +posixlib -I ../includes/ -I ../platforms/ xsDate.c
Splint 3.1.2 --- 02 Dec 2020

../platforms/xsPlatform.h:81:26: #error unknown compiler
  Preprocessing error. (Use -preproc to inhibit warning)
   In file included from xsCommon.h:41,
                 from xsAll.h:41,
                 from xsDate.c:38
xsAll.h:1942:85: Invalid character in macro parameter name
   In file included from xsDate.c:38
xsAll.h:1942:85: Parameter list for #define is not parseable
xsAll.h:1957:88: Invalid character in macro parameter name
...
*** Cannot continue.

found while looking for CCured via SO clue.

SAFECode

SAFECode: Enforcing Alias Analysis for Weakly Typed Languages Dinakar Dhurjati, Sumant Kowshik, and Vikram Adve. ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), June 2006

https://news.ycombinator.com/item?id=17143237

[1705.07354] The Meaning of Memory Safety

Checked C looks active

Checked C: Making C Safe by Extension - Microsoft Research September 2018

https://github.com/Microsoft/checkedc 52d7a7f on Dec 18, 2020

https://github.com/Microsoft/checkedc-clang/releases 2020-07-31 2745f7d

dckc commented 3 years ago

fuzzing

I played around with AFL tonight. AFL: american fuzzy lop - a security-oriented fuzzer

This looks like the result of some fuzzing too:

SEGV at /xs/sources/xsBigInt.c:1182 · Issue #485 · Moddable-OpenSource/moddable

looking at other work by the reporter... kvenux/capflow: Isabelle proof for CapFlow

and from there: A Verified Capability-Based Model for Information Flow Security With Dynamic Policies

p.s. cf https://github.com/wasmerio/wasmer/tree/master/fuzz

dckc commented 3 years ago

getters and setters vs. static analysis

The main difference between SES and SESlight is that SES supports getters/setters and SESlight does not because they are not amenable to the analysis methods we considered practical and ...

-- Automated Analysis of Security-Critical JavaScript APIs found via A Taxonomy of Security Issues - Agoric

erights commented 3 years ago

Yes. And JS has changed even more since that article was written.

dckc commented 3 years ago

to prevent escape from vat:

@erights wondered about in-vat mutual-suspicion boundaries:

Formalizing the LLVM Intermediate Representation for Verified Program Transformations esbmc/esbmc: The efficient SMT-based bounded model checker static-analysis-engineering/CodeHawk-C: CodeHawk C Analyzer: sound static analysis of memory safety (undefined behavior)

setjmp / longjump

Support for setjmp / longjmp · Issue #2625 · rust-lang/rfcs

frama-C doesn't support it.

dckc commented 3 years ago

esmbc -- Efficient model checking with CI techniques

This looks like a particularly good match for XS:

Continuous Verification of Large Embedded Software using SMT-Based Bounded Model Checking

The project is active; the maintainers responded to an issue within a day: https://github.com/esbmc/esbmc/issues/355#issuecomment-771552103

Map2Check -- fuzzing and symbolic execution

esbmc contributors overlap with https://github.com/hbgit/Map2Check , which combines fuzzing and symbolic execution.

tried the release from late 2019; not promising:

~/opt/map2check$ ./map2check  --target-function fxBuildDate ~/projects/moddable/build/tmp/lin/debug/xst/xsDate.bc
Adopting z3 solver... 
Started Map2Check
std::bad_alloc

nesCheck -- good performance on subset of C

adding "embedded" to our search terms turns up other interesting work...

nesCheck is aimed at a constrained subset of C, but the results are interesting:

Through extensive evaluation benchmarks – both on specifically constructed programs and the standard TinyOS applications – we showed the effectiveness of nesCheck in enforcing memory protection while minimizing the runtime performance overhead (0.84% on energy, 5.3% on code size, up to 8.4% on performance, and 16.7% on RAM). -- Midi et. a. 2017

dckc commented 3 years ago

I'm experimenting with coverity scan: https://scan.coverity.com/projects/dckc-moddable?tab=overview

hmm... more involve than my first impression suggested.

dckc commented 3 years ago

discovered https://github.com/google/oss-fuzz by way of https://twitter.com/LazyFishBarrel today

fuzzing/libFuzzerTutorial.md at master · google/fuzzing

Google Online Security Blog: Improving open source security during the Google summer internship program

dckc commented 3 years ago

https://github.com/trailofbits/manticore , a Symbolic execution tool, can analyze WebAssembly . I wonder if analysis of XS compiled to WASM would be fruitful.

dckc commented 3 years ago

tangentially related... (i.e. above the xsnap layer... or is it?)

@erights writes:

We need to introduce and document and maintain a careful distinction of the names of static direct hardeners. harden obviously but now also Far and for the moment Data are all in this category. Likely makeKind too though I have not looked lately.

We can add static amountMath and its static methods, or some of its static methods to that list. When we write a static verifier for hardening, it must know about this category.

dckc commented 3 years ago

Another fuzzing expert is working on XS! Go @rain6851 !

https://github.com/Moddable-OpenSource/moddable/issues/580 thru https://github.com/Moddable-OpenSource/moddable/issues/587

cc @erights

dckc commented 3 years ago

safety, metering via WASM?

Crazy idea? Kill two birds with one stone by compiling xsnap to WASM:

cc @michaelfig @warner @erights @dtribble @kriskowal

zmanian commented 3 years ago

General information here

dtribble commented 3 years ago

safety, metering via WASM?

Crazy idea? Kill two birds with one stone by compiling xsnap to WASM:

  • metering
  • fail-stop safety: if XS isn't memory safe, WASM should make it fail-stop

We considered this option (ty Zaki for the detailed options).

WRT metering, that means that the actual metering costs now derive the the entire tool chain of compilation of C to WASM, and the presents as a mysterious mapping to user-specified code. It ensures that nothing escapes the metering, but it would also make it difficult to have select things with custom metering (e.g., if we wanted to charge fixed costs for some class of crypto operations to not penalize multisig).

WRT fail-stop memory-safety, we are already in position to have each vat run in a separate process, so WASM adds nothing there. It could conceivably provide stronger safety for multiple-vats within a single process, but for the cost of checking that taller tool chain, we could add other memory safety sanity checks.

I would like to see XS or something like it directly in Rust someday :).

dckc commented 3 years ago

WRT metering, that means that the actual metering costs now derive the the entire tool chain of compilation of C to WASM,

Yes, I thought of that soon after I posted the idea. Any change at the C layer would change metering results.