Closed corentinllorca closed 4 years ago
Here are the crypto folders of the main crypto libraries:
Libgcrypt: there’s a folder containing all ciphers https://github.com/gpg/libgcrypt/tree/master/cipher
OpenSSL: crypto folder https://github.com/openssl/openssl/tree/master/crypto
LibSodium: the libsodium folder contains many crypto subfolders https://github.com/jedisct1/libsodium/tree/master/src/libsodium
NaCl: main folder contains crypto subfolders https://github.com/krig/nacl
Nettle: crypto files scattered within the main folder https://git.lysator.liu.se/nettle/nettle
wolfCrypt: files scattered within this folder https://github.com/wolfSSL/wolfssl/tree/master/wolfcrypt/src
Arm Mbed Crypto: https://github.com/ARMmbed/mbed-crypto/tree/37b5c831b41cd41456caa979f1444234c51e4c51/library
Only a limited number of crypto algorithms are in use today, so pretty much all the libraries implement the same ones. However, the implementations differ (at least in terms of code structure), so together they would make an interesting dataset.
The number of files, though, is likely to be rather small (a few hundred at most).
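One rough way to check that estimate is to count source files by extension in a local clone of each library. The sketch below assumes the repositories are already cloned locally. The `count_by_extension` helper and the set of extensions are illustrative choices, not part of any existing tooling.

```python
from collections import Counter
from pathlib import Path

# Assumed set of extensions worth counting; adjust per library.
SOURCE_EXTENSIONS = {".c", ".h", ".S", ".s", ".asm"}


def count_by_extension(root):
    """Count files under `root` whose extension is in SOURCE_EXTENSIONS."""
    counts = Counter()
    # rglob("*") walks the whole tree, so nested crypto subfolders are included.
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in SOURCE_EXTENSIONS:
            counts[path.suffix] += 1
    return counts
```

Calling it on a hypothetical local path such as `count_by_extension("libgcrypt/cipher")` returns a per-extension tally, which also shows how much of a library is assembly rather than C.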
@corentinllorca, as you said, there are quite a lot of undesired files in there (helpers, wrappers, etc.). It is difficult to estimate how many files should be removed, since that depends a lot on the library. However, I see 2 issues with using WindRiver for that:
For certain packages, quite a lot of functions are implemented in assembly (.S files) and had to be dropped.
A few random examples of files that are harder to classify:
Also, at least for crypto libraries, quite a lot of files have a well-written docstring that explains exactly what the file is for. Should we consider that a data leak?
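If the answer is yes, one possible mitigation would be to strip the leading comment from each file before using it. The following is only a minimal sketch of that idea, assuming C-style `/* ... */` headers. The `strip_header_comment` name is hypothetical, and nothing in this thread says this approach was actually used.

```python
import re

# Matches a /* ... */ block comment at the very start of a file,
# allowing leading whitespace before it and trailing whitespace after it.
LEADING_BLOCK_COMMENT = re.compile(r"\A\s*/\*.*?\*/\s*", re.DOTALL)


def strip_header_comment(source):
    """Return `source` without its leading /* ... */ comment, if any."""
    return LEADING_BLOCK_COMMENT.sub("", source, count=1)
```

This removes only the first block comment, so a file that starts with a license header followed by a separate description comment would need the function applied more than once. Comments further down in the code are left untouched.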
810 code files extracted
Merged and done
See #1
WARNING: there might be some false positives in there. Not every file in a crypto library implements crypto. Either check the files by hand or run them through Wind-River.