hacl-star / merkle-tree

A verified Merkle Tree, built as a standalone project on top of EverCrypt
6 stars 6 forks source link

Undefined Behaviour after deserialising Merkle tree #2

Open jumaffre opened 3 years ago

jumaffre commented 3 years ago

We recently turned on more sanitizer checks for CCF (i.e. -fsanitize=undefined,address -fno-omit-frame-pointer -fno-sanitize-recover=all -fno-sanitize=function). We've observed a runtime error after deserialising a Merkle tree and appending two hashes to it:

../3rdparty/hacl-star/evercrypt/MerkleTree.c:177:17: runtime error: null pointer passed as argument 2, which is declared to never be null

A minimal repro is (using CCF's thin C++ wrapper around EverCrypt hash library):

// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the Apache 2.0 License.

#define DOCTEST_CONFIG_IMPLEMENT_WITH_MAIN
#include "crypto/hash.h"
#include "ds/logger.h"

#include <doctest/doctest.h>
#include <hacl-star/evercrypt/MerkleTree.h>
#include <string>

TEST_CASE("Test for san")
{
  merkle_tree* src_tree;
  merkle_tree* dst_tree;

  // Note: crypto::Sha256Hash is a thin C++ wrapper around evercrypt_sha256
  crypto::Sha256Hash first_hash = {};
  src_tree = mt_create(first_hash.h.data());

  // Insert one additional hash in first tree
  std::string data = fmt::format("to_be_hashed");
  crypto::Sha256Hash hash(data);
  uint8_t* h = hash.h.data();
  if (!mt_insert_pre(src_tree, h))
  {
    throw std::logic_error("Precondition to mt_insert violated");
  }
  mt_insert(src_tree, h);

  // Serialise first tree
  std::vector<uint8_t> serialised(mt_serialize_size(src_tree));
  mt_serialize(src_tree, serialised.data(), serialised.capacity());

  // Deserialise in second tree
  dst_tree =
    mt_deserialize(serialised.data(), serialised.size(), mt_sha256_compress);

  // Insert two more hashes in second tree
  for (size_t i = 0; i < 2; i++)
  {
    std::string data = fmt::format("to_be_hashed: {}", i);
    crypto::Sha256Hash hash(data);
    uint8_t* h = hash.h.data();
    if (!mt_insert_pre(dst_tree, h))
    {
      throw std::logic_error("Precondition to mt_insert violated");
    }
    // Second insertion raises
    //     ../3rdparty/hacl-star/evercrypt/MerkleTree.c:177:17: runtime error:
    //     null pointer passed as argument 2, which is declared to never be null
    // /usr/include/string.h:43:28: note: nonnull attribute specified here
    // SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior
    // ../3rdparty/hacl-star/evercrypt/MerkleTree.c:177:17 in
    mt_insert(dst_tree, h);
  }

  {
    mt_free(src_tree);
    mt_free(dst_tree);
  }
}

Backtrace is:

#0  insert___uint8_t_ (vec=..., v=0x603000000fa0 "\331Y\253\004\006\333b\250\266I\253\370=xm\034\252\373\336\063\245\256\202\203V\323\361d") at ../3rdparty/hacl-star/evercrypt/MerkleTree.c:177
hacl-star/hacl-star#1  0x00000000007ec1ee in insert___uint8_t__uint32_t (rv=..., v=0x603000000fa0 "\331Y\253\004\006\333b\250\266I\253\370=xm\034\252\373\336\063\245\256\202\203V\323\361d") at ../3rdparty/hacl
-star/evercrypt/MerkleTree.c:982
hacl-star/hacl-star#2  0x00000000007ebbb6 in insert_copy___uint8_t__uint32_t (rg=..., cp=0x7ddf50 <hash_copy>, rv=..., v=0x7fffffffbce0 "\331Y\253\004\006\333b\250\266I\253\370=xm\034\252\373\336\063\245\256\202\203V\323\361d\024") at ../3rdparty/hacl-star/evercrypt/MerkleTree.c:1002
hacl-star/hacl-star#3  0x00000000007dbe58 in insert_ (hsz=32, lv=2, j=0, hs=..., acc=0x7fffffffbce0 "\331Y\253\004\006\333b\250\266I\253\370=xm\034\252\373\336\063\245\256\202\203V\323\361d\024", hash_fun=0x7d
7ca0 <mt_sha256_compress>) at ../3rdparty/hacl-star/evercrypt/MerkleTree.c:1029
hacl-star/hacl-star#4  0x00000000007dc356 in insert_ (hsz=32, lv=1, j=1, hs=..., acc=0x7fffffffbce0 "\331Y\253\004\006\333b\250\266I\253\370=xm\034\252\373\336\063\245\256\202\203V\323\361d\024", hash_fun=0x7d
7ca0 <mt_sha256_compress>) at ../3rdparty/hacl-star/evercrypt/MerkleTree.c:1046
hacl-star/hacl-star#5  0x00000000007dc356 in insert_ (hsz=32, lv=0, j=3, hs=..., acc=0x7fffffffbce0 "\331Y\253\004\006\333b\250\266I\253\370=xm\034\252\373\336\063\245\256\202\203V\323\361d\024", hash_fun=0x7d
7ca0 <mt_sha256_compress>) at ../3rdparty/hacl-star/evercrypt/MerkleTree.c:1046
hacl-star/hacl-star#6  0x00000000007cc605 in MerkleTree_Low_mt_insert (mt=0x607000000250, v=0x7fffffffbce0 "\331Y\253\004\006\333b\250\266I\253\370=xm\034\252\373\336\063\245\256\202\203V\323\361d\024") at ../
3rdparty/hacl-star/evercrypt/MerkleTree.c:1065
hacl-star/hacl-star#7  0x00000000007cc1fd in mt_insert (mt=0x607000000250, v=0x7fffffffbce0 "\331Y\253\004\006\333b\250\266I\253\370=xm\034\252\373\336\063\245\256\202\203V\323\361d\024") at ../3rdparty/hacl-s
tar/evercrypt/MerkleTree.c:263
hacl-star/hacl-star#8  0x00000000005671da in _DOCTEST_ANON_FUNC_8 () at ../src/node/test/test_merkle_san.cpp:55
hacl-star/hacl-star#9  0x000000000055f7b2 in doctest::Context::run (this=0x7fffffffe140) at ../3rdparty/doctest/doctest.h:6112
hacl-star/hacl-star#10 0x0000000000565eaa in main (argc=1, argv=0x7fffffffe2f8) at ../3rdparty/doctest/doctest.h:6196

At this point:

(gdb) p vs
$1 = (uint8_t **) 0x0

Please let me know if you need any additional detail on this. For now, we've added MerkleTree.c to our sanitizer blacklist.

wintersteiger commented 3 years ago

@protz this is about the non-null annotation on the memcpy arguments, which clang reported some time ago. I thought you fixed this, but something about it doesn't seem to be working. A simple example is

let hash_copy #_ s src dst =
  B.blit src 0ul dst 0ul s

in MerkleTree.Low.Datastructures.fst, which extracts to

static void hash_copy(uint32_t s, uint8_t *src, uint8_t *dst)
{
  memcpy(dst, src, s * sizeof (uint8_t));
}

where src/dst are allowed to be NULL because of the definition of the hash type:

type hash (#hsz:hash_size_t) = b:B.buffer uint8 { B.len b = hsz \/ B.g_is_null b }

So, I think this memcpy needs to be guarded by a NULL-check. Without the sanitizer, clang just calls memcpy, which happens to do the right thing even when the arguments are NULL, but that's unspecified behaviour.

msprotz commented 3 years ago

Hi Christoph,

There is a massive rewrite underway that fixes this. However, it's very labor-intensive, and many algorithms need to be rewritten to avoid the insertion of null-checks. Do you have time? We could use some help porting the EverCrypt layers and the hash/hmac/hkdf algorithms to avoid ugly C code.

Relevant Slack thread: https://everestexpedition.slack.com/archives/C4237009M/p1596467100365000

Thanks,

Jonathan

wintersteiger commented 3 years ago

That's great, thanks for the pointer! I don't have a lot of time, but I'll definitely look at the Merkle tree code.

msprotz commented 1 year ago

Same as hacl-star/hacl-star#327 (cross-referencing the two issues)