diffblue / cbmc

C Bounded Model Checker
https://diffblue.github.io/cbmc
Other
847 stars 262 forks source link

Memory consumption during pointer analysis #225

Closed martin-cs closed 3 years ago

martin-cs commented 8 years ago

So here's a bug I never thought I'd have to submit BUT a major industrial partner has a problem with memory consumption in static analysis.

Imagine you have a program with, say, 27,500 global variables and thus __CPROVER_initialise has about 27,500 instructions which means 378,125,000 entries in the abstract domain maps, just for this function. Which is 8Gb at 24 bytes per entry (given map overhead this is an underestimate).

Clearly the solution is to do some kind of structural sharing between them. As this could be quite useful in other parts of the code I thought it worth a little discussion before starting. Design questions:

  1. I think the right answer is to keep the map interface and then implement this as a tree / hash-trie like structure.
  2. Should the nodes in this trie be irept's? The advantage would be that the code is already written and working, i/c reference counting, copy-on-write and deduplification (merge_irept). The disadvantage is that it's a bit of an abuse of the structure and it's not ideally keeping arbitrary abstract domain elements in them (or perhaps these should be irepts as well).
  3. Copy-on-write or some kind of hashing / de-duplicating structure? I'm leaning towards copy-on-write first as that should drop us down to O(instructions * difference * log(variables)) where differences will be 1 for many case of interest. We can then add de-duplication later to re-discover lost sharing if needed.

Suggestions welcome.

@kroening @peterschrammel @tautschnig

thomasspriggs commented 3 years ago

Hi Martin. Did you ever resolve your memory usage issue? Was there ever a collective decision on how to resolve sharing issues like this? Do you think this github issue is still useful for discussion or are you happy for it to be closed out due to it no longer being relevant / useful?

thomasspriggs commented 3 years ago

I am going to close this out, due to the length of time since there was useful discussion on this issue and due to the lack of response from the Martin. Please feel free to re-open if you think this is still relevant.

martin-cs commented 3 years ago

@thomasspriggs : sorry I missed this.

Yes; this was resolved, in a way. sharing_mapt allows us to share (CoW) objects within abstract domains. VSD makes very heavy use of this. The ai_storaget interface also allows sharing of domains between locations, although this is not implemented yet.

So, problem solved and yes, this should be closed!