NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0

Disassembler: ContextCache is unperformant for multithreaded analysis #6649

Open sad-dev opened 1 week ago

sad-dev commented 1 week ago

Is your feature request related to a problem? Please describe.

The code for ContextCache.getWords is shown below:

    private BigInteger lastContextValue;
    private int[] lastContextWords;

    private synchronized int[] getWords(BigInteger value) {
        if (value.equals(lastContextValue)) {
            return lastContextWords;
        }
        ...
        lastContextValue = value;
        lastContextWords = words;
        return words;
    }

Each SleighLanguage maps one-to-one to a LanguageID, and so does its underlying ContextCache. Every Disassembler returned by getDisassembler(Language language,...) for a given language therefore shares the same ContextCache and has to take the same lock in getWords. This severely hurts the performance of scripts that run multiple Disassembler instances in parallel, and of decompiler threads that may also need to disassemble.

Describe the solution you'd like

I'm deeply skeptical of how much utility the getWords caching brings, but if it must be preserved I would recommend either a concurrent collection with a size limit or an atomic. Otherwise I would simply remove the cache altogether.
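A minimal sketch of the atomic variant, to make the suggestion concrete. This is not Ghidra code: the class name AtomicWordsCache, the Entry holder, the wordCount parameter and the computeWords helper are all hypothetical, and computeWords is only a placeholder for the conversion elided as "..." in the snippet above. The idea is to keep the value and its words together in one immutable object behind an AtomicReference, so that a hit costs one volatile read and no thread ever blocks.

```java
import java.math.BigInteger;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for the caching part of ContextCache (not Ghidra code). */
final class AtomicWordsCache {

    /** Immutable pair, so a reader can never see a value paired with the wrong words. */
    private static final class Entry {
        final BigInteger value;
        final int[] words;

        Entry(BigInteger value, int[] words) {
            this.value = value;
            this.words = words;
        }
    }

    private final AtomicReference<Entry> last = new AtomicReference<>();
    private final int wordCount; // assumed: number of 32-bit context words

    AtomicWordsCache(int wordCount) {
        this.wordCount = wordCount;
    }

    /** Lock-free lookup: one volatile read on a hit, one volatile write on a miss. */
    int[] getWords(BigInteger value) {
        Entry e = last.get();
        if (e != null && e.value.equals(value)) {
            return e.words;
        }
        int[] words = computeWords(value);
        // Racing writers simply overwrite each other; every published entry is valid.
        last.set(new Entry(value, words));
        return words;
    }

    /**
     * Placeholder for the elided conversion. Assumption: the bytes of the value are
     * packed big-endian and right-aligned into wordCount ints.
     */
    private int[] computeWords(BigInteger value) {
        byte[] bytes = value.toByteArray();
        int[] words = new int[wordCount];
        int limit = Math.min(bytes.length, wordCount * 4);
        for (int i = 0; i < limit; i++) {
            int b = bytes[bytes.length - 1 - i] & 0xff; // i-th least significant byte
            words[wordCount - 1 - i / 4] |= b << (8 * (i % 4));
        }
        return words;
    }
}
```

Like the current code, this hands the cached array itself to callers, so it assumes they treat it as read-only. Threads that alternate between different context values will keep evicting each other's entry, which is where a bounded concurrent collection might do better.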

Describe alternatives you've considered

Allow clones of the Disassembler/SleighLanguage? Or add bogus copies of each language?