Closed ETERNALBLUEbullrun closed 6 months ago
I'm not sure exactly what you're suggesting here...
I just suggest matching VirusTotal's static analysis/sandboxes, plus using artificial neurons (CNS) to better secure us. The original post now has examples of this.
This module is an interface to ClamAV for NodeJS. Doing more than that would be out of the scope of its use case. Thanks for your interest, though.
In https://github.com/Cisco-Talos/clamav/issues/1206 the developers asked for a pull request. There is lots to do. Does anyone else want to do this?
Repurposed from https://swudususuwu.substack.com/p/howto-produce-better-virus-scanners ("Allows all uses"). Static analysis + sandbox + CNS = 1 second (approx) analysis of new executables (protects all app launches), but caches reduce this to less than 1ms (just the cost to look up `ResultList::hashes`, which is `std::unordered_set<decltype(sha2(const FileBytecode &))>`; a hash set of hashes).
Licenses: allows all uses ("Creative Commons"/"Apache 2")
[Version of post is ?`VirusAnalysis.md` Move ClassSys above ClassSha2 · SwuduSusuwu/SubStack@7f38bd0]
For the newest sources (+ static libs), use apps such as iSH (for iOS) or Termux (for Android OS) to run this: `git clone https://github.com/SwuduSusuwu/SubStack.git && cd ./SubStack/ && ./build`
`less cxx/Macros.hxx`
`less cxx/ClassPortableExecutable.hxx`
`less cxx/ClassSys.hxx`
`less cxx/ClassSys.cxx`
`less cxx/ClassSha2.hxx`
`less cxx/ClassSha2.cxx`
`less cxx/ClassResultList.hxx`
`less cxx/ClassCns.hxx`
`less cxx/ClassCns.cxx`
`less cxx/VirusAnalysis.hxx`
`less cxx/VirusAnalysis.cxx`
`less cxx/main.cxx` /* with boilerplate */
To run most of this fast (lag less), use `CXXFLAGS` which auto-vectorize/auto-parallelize, and to set up CNS synapses (`Cns::setupSynapses()`) fast, use TensorFlow's `MapReduce`. Resources: How to have computers process fast.
For comparison;
`produceVirusFixCns` is close to assistants (such as "ChatGPT 4.0" or "Claude-3 Opus"); such a demo is `produceAssistantCns`:
`less cxx/AssistantCns.hxx`
`less cxx/AssistantCns.cxx`
=================================================
Hash resources: A hash is just a checksum (such as SHA-2) of each sample input, which maps to "this passes" (or "this does not pass"). https://wikipedia.org/wiki/Sha-2
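A minimal sketch of such a hash cache, to show why a cached verdict costs just one lookup. Assumptions: `FileBytecode` is treated as a `std::string` of raw bytes, `hashBytes`, `Verdict` and `HashCache` are hypothetical names (not the repository's classes), and 64-bit FNV-1a stands in for SHA-2 just to keep the example self-contained; a real tool would use SHA-256 from a vetted crypto library.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_set>

typedef std::string FileBytecode;  // assumption: raw bytes of one executable
typedef std::uint64_t FileHash;

// Stand-in for sha2(): 64-bit FNV-1a. NOT cryptographic (collisions can be
// forged), so this is for illustration only.
FileHash hashBytes(const FileBytecode &bytes) {
    FileHash hash = 14695981039346656037ULL;  // FNV-1a offset basis
    for (unsigned char byte : bytes) {
        hash ^= byte;
        hash *= 1099511628211ULL;  // FNV-1a prime
    }
    return hash;
}

enum class Verdict { Pass, Block, Unknown };

// Hypothetical cache, shaped like a pair of hash sets.
struct HashCache {
    std::unordered_set<FileHash> passes;  // "this passes"
    std::unordered_set<FileHash> blocks;  // "this does not pass"

    void remember(const FileBytecode &bytes, Verdict verdict) {
        assert(verdict != Verdict::Unknown);  // only store real verdicts
        (verdict == Verdict::Pass ? passes : blocks).insert(hashBytes(bytes));
    }

    // Average O(1) lookup; Unknown means the slower analyses must run.
    Verdict lookup(const FileBytecode &bytes) const {
        const FileHash hash = hashBytes(bytes);
        if (blocks.count(hash)) return Verdict::Block;
        if (passes.count(hash)) return Verdict::Pass;
        return Verdict::Unknown;
    }
};
```

On a launch, the tool would call `lookup()` first, and run static analysis/sandbox/CNS (then `remember()` the result) just when the verdict is `Unknown`.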
Signature resources: A signature is just a substring (or regex) of infections, which the virus analysis tool checks all executables for; if the signature is found in the executable, do not allow it to launch; otherwise, launch it. https://wikipedia.org/wiki/Regex
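The signature check can be sketched as a plain substring search over the executable's bytes. Assumptions: `FileBytecode` is a `std::string` of raw bytes, and `hasSignature`/`allowLaunch` are hypothetical names for illustration. For regex signatures, `std::regex_search` could replace `std::search`.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

typedef std::string FileBytecode;  // assumption: raw bytes of one executable

// True if any known infection signature occurs as a substring of the file.
bool hasSignature(const FileBytecode &executable,
                  const std::vector<FileBytecode> &signatures) {
    for (const FileBytecode &signature : signatures) {
        assert(!signature.empty());  // an empty signature would match all files
        if (std::search(executable.begin(), executable.end(),
                        signature.begin(), signature.end()) != executable.end()) {
            return true;
        }
    }
    return false;
}

// Launch gate: block on a signature hit, otherwise allow the launch.
bool allowLaunch(const FileBytecode &executable,
                 const std::vector<FileBytecode> &signatures) {
    return !hasSignature(executable, signatures);
}
```

This is linear in (file size × number of signatures); tools with huge signature databases would want a multi-pattern matcher instead.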
Static analysis resources: https://github.com/topics/analysis has lots of open source (FLOSS) analysis tools (such as https://github.com/kylefarris/clamscan, which wraps https://github.com/Cisco-Talos/clamav/), which show how to use hex dumps (or disassembled sources) of the apps/SW (executables) to deduce what the apps/SW do to your OS.
Static analysis (such as Clang/LLVM has) just checks programs for accidental security threats (such as buffer overruns/underruns, or null-pointer dereferences), but could act as a basis if you add a few extra checks for deliberate vulnerabilities/signs of infection (these are heuristics, so the user should have a choice to quarantine and submit for review, or to continue the launch).
https://github.com/llvm/llvm-project/blob/main/clang/lib/StaticAnalyzer is part of Clang/LLVM (license is FLOSS) and does static analysis (emulation produces inputs to functions, and formulas analyze stacktraces (+ heap/stack uses) to produce lists of possible unwanted side effects to warn you of). Versus `-fsanitize`, you do not have to recompile to do static analysis; `-fsanitize` requires you to produce inputs, whereas static analysis does this for you.
LLVM is lots of files; PhASAR is just a static analysis framework built on LLVM: https://github.com/secure-software-engineering/phasar
Example outputs (tests "Fdroid.apk") from VirusTotal, of static analysis + 2 sandboxes; the false positive outputs (from VirusTotal's Zenbox) show the purpose of manual review.
Sandbox resources: As opposed to static analysis of the executable's hex (or disassembled sources), sandboxes perform chroot + functional analysis. https://wikipedia.org/wiki/Valgrind is just meant to locate accidental security vulnerabilities, but is a common example of functional analysis. On Linux, tools can use `chroot()` (run `man chroot` for instructions) so that the programs you test cannot alter stuff outside of the test; plus can use `strace` (run `man strace` for instructions, or look at https://opensource.com/article/19/10/strace https://www.geeksforgeeks.org/strace-command-in-linux-with-examples/ ), which traces all system calls and saves logs for functional analysis.
Simple sandboxes just launch programs with `chroot()` + `strace` for a few seconds, with all outputs sent for manual review; more complex sandboxes have heuristics to guess what is important (in case of lots of submissions, so manual reviews have less to do).
Autonomous sandboxes (such as VirusTotal's) use full outputs from all analyses, with formulas to guess whether the app/SW is safe for us (thousands of rules such as "Should not alter files of other programs unless prompted to through OS dialogs", "Should not perform network access unless prompted to from you", "Should not perform actions leading to obfuscation which could hinder analysis"), which, if violated, add to the executable's "danger score" (which the analysis results page shows you).
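The "danger score" idea can be sketched as rules matched against a system call log. Assumptions: the log is a list of `strace`-style text lines, `Rule`, `dangerScore` and `exampleRules` are hypothetical names, and the three rules (with their weights) are made up for illustration; they are not VirusTotal's rules.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One hypothetical rule: if `pattern` occurs in any logged system call,
// `weight` is added to the danger score (once per rule).
struct Rule {
    std::string pattern;
    int weight;
    std::string reason;
};

int dangerScore(const std::vector<std::string> &syscallLog,
                const std::vector<Rule> &rules) {
    int score = 0;
    for (const Rule &rule : rules) {
        assert(rule.weight >= 0);
        for (const std::string &line : syscallLog) {
            if (line.find(rule.pattern) != std::string::npos) {
                score += rule.weight;
                break;  // count each violated rule once
            }
        }
    }
    return score;
}

// Illustrative rules only; the patterns and weights are assumptions.
std::vector<Rule> exampleRules() {
    std::vector<Rule> rules;
    rules.push_back(Rule{"connect(", 3, "network access without a prompt"});
    rules.push_back(Rule{"ptrace(", 5, "possible attempt to hinder analysis"});
    rules.push_back(Rule{"O_WRONLY", 2, "opens a file for writing"});
    return rules;
}
```

Substring rules are crude (the write rule, for example, cannot tell whose file is opened); a real sandbox would parse the call arguments, but the scoring loop has the same shape.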
CNS resources: Once the virus analysis tool has static + functional analysis (+ sandbox), the next logical move is to do artificial CNS. Just as (if humans grew trillions of neurons plus thousands of layers of cortices) one of us could parse all databases of infections (plus samples of fresh apps/SW) to set up our synapses to parse hex dumps of apps/SW (to allow us to revert all infections to fresh apps/SW, or, if the whole thing is an infection, just block it), so too could artificial CNS (with trillions of artificial neurons) do this:
For analysis, pass training inputs mapped to outputs (infection -> block, fresh apps/SW -> pass) to the artificial CNS.
To undo infections (to restore to fresh apps/SW), inputs = samples of all (infections or fresh apps/SW), outputs = EOF/null (if it is an infection that cannot revert to fresh apps/SW), or else outputs = fresh apps/SW.
To set up synapses, you must have access to huge sample databases (such as VirusTotal's access).
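The "inputs mapped to outputs" training step can be sketched with a single artificial neuron and the perceptron rule. This is a toy, far smaller than the CNS the post describes, and it covers the analysis case only (infection -> block, fresh -> pass), not the repair case. Assumptions: `Sample` and `Neuron` are hypothetical names, and the feature vectors are made-up numbers standing in for whatever is extracted from the executables.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One training pair: features of an executable (hypothetical, such as counts
// of suspicious byte patterns) and the wanted output.
struct Sample {
    std::vector<double> features;
    int label;  // 1 = infection (block), 0 = fresh app/SW (pass)
};

// A single artificial neuron with a step activation.
struct Neuron {
    std::vector<double> weights;
    double bias;

    explicit Neuron(std::size_t inputs) : weights(inputs, 0.0), bias(0.0) {}

    int predict(const std::vector<double> &features) const {
        assert(features.size() == weights.size());
        double sum = bias;
        for (std::size_t i = 0; i < weights.size(); ++i) {
            sum += weights[i] * features[i];
        }
        return sum > 0.0 ? 1 : 0;
    }

    // Perceptron rule: nudge the weights toward each misclassified sample.
    void train(const std::vector<Sample> &samples, int epochs) {
        for (int epoch = 0; epoch < epochs; ++epoch) {
            for (const Sample &sample : samples) {
                const int error = sample.label - predict(sample.features);
                for (std::size_t i = 0; i < weights.size(); ++i) {
                    weights[i] += error * sample.features[i];
                }
                bias += error;
            }
        }
    }
};
```

One neuron can just learn linearly separable sets; the point is the shape of the loop (samples in, verdicts out, synapses adjusted on each error), which larger networks share.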
GitHub has lots of FLOSS (free/libre and open source software) simulators of CNS at https://github.com/topics/artificial-neural-network which are useful for assistants (such as "ChatGPT 4.0" or "Claude-3 Opus"), but are not close to complex enough to house human consciousness:
"HSOM" ( https://github.com/CarsonScott/HSOM , license is FLOSS ) is a simple Python neural map.
"apxr_run" ( https://github.com/Rober-t/apxr_run/ , license is FLOSS ) is almost complex enough to house human consciousness; "apxr_run" has:
Various FLOSS neural network activation functions (absolute, average, standard deviation, sqrt, sin, tanh, log, sigmoid, cos), plus sensor functions (vector difference, quadratic, multiquadric, saturation [+D-zone], gaussian, cartesian/planar/polar distances): https://github.com/Rober-t/apxr_run/blob/master/src/lib/functions.erl
Various FLOSS neuroplastic functions (self-modulation, Hebbian function, Oja's function): https://github.com/Rober-t/apxr_run/blob/master/src/lib/plasticity.erl
Various FLOSS neural network input aggregator functions (dot products, product of differences, mult products): https://github.com/Rober-t/apxr_run/blob/master/src/agent_mgr/signal_aggregator.erl
Various simulated-annealing functions for artificial neural networks (dynamic [+ random], active [+ random], current [+ random], all [+ random]): https://github.com/Rober-t/apxr_run/blob/master/src/lib/tuning_selection.erl
Choices to evolve connections through Darwinian or Lamarckian formulas: https://github.com/Rober-t/apxr_run/blob/master/src/agent_mgr/neuron.erl
Simple to convert Erlang functions to Java/C++ (to reuse for fast programs); Erlang's syntax is close to Prolog's.
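As a rough idea of what such functions look like in C++, here is a sketch of two activation functions, a dot-product aggregator, and the Hebbian and Oja plasticity rules. These are the textbook forms, not transcriptions of the "apxr_run" Erlang sources (which could differ in details such as clamping or per-synapse parameters); all function names and signatures here are assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Activation functions (textbook forms).
double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }
double tanhActivation(double x) { return std::tanh(x); }

// Input aggregator: dot product of inputs and synapse weights.
double dotProduct(const std::vector<double> &inputs,
                  const std::vector<double> &weights) {
    assert(inputs.size() == weights.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < inputs.size(); ++i) {
        sum += inputs[i] * weights[i];
    }
    return sum;
}

// Hebbian plasticity: w_i += rate * input_i * output.
void hebbianUpdate(std::vector<double> &weights,
                   const std::vector<double> &inputs,
                   double output, double rate) {
    assert(inputs.size() == weights.size());
    for (std::size_t i = 0; i < weights.size(); ++i) {
        weights[i] += rate * inputs[i] * output;
    }
}

// Oja's rule: w_i += rate * output * (input_i - output * w_i).
// The extra decay term keeps the weights from growing without bound.
void ojaUpdate(std::vector<double> &weights,
               const std::vector<double> &inputs,
               double output, double rate) {
    assert(inputs.size() == weights.size());
    for (std::size_t i = 0; i < weights.size(); ++i) {
        weights[i] += rate * output * (inputs[i] - output * weights[i]);
    }
}
```

A neuron's output would be something such as `sigmoid(dotProduct(inputs, weights))`, followed by one of the update functions if plasticity is enabled.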
Examples of how to set up APXR as artificial CNS: https://github.com/Rober-t/apxr_run/blob/master/src/examples/
Examples of how to set up HSOM as artificial CNS: https://github.com/CarsonScott/HSOM/tree/master/examples
Simple to set up once you have access to databases.
Alternative CNS: https://swudususuwu.substack.com/p/albatross-performs-lots-of-neural
This post is about general methods to produce virus analysis tools; it does not require that local resources do all of this.
=================================================
How to reproduce the problem
Scan new executables (that are not part of stock databases)