salsa-rs / salsa

A generic framework for on-demand, incrementalized computation. Inspired by adapton, glimmer, and rustc's query system.
https://salsa-rs.netlify.app/
Apache License 2.0
2.14k stars 152 forks source link

LRU ("garbage collection") for interned values #597

Open nikomatsakis opened 1 month ago

nikomatsakis commented 1 month ago

As described in this hackmd, we would like to support LRU-style collection for interned values so that we can cap maximum memory consumption.

nikomatsakis commented 1 month ago

Mentoring instructions

Here are some of the steps required

MichaReiser commented 1 month ago

@ibraheemdev are you working on this or parts of it?

ShoyuVanilla commented 1 month ago

This seems interesting. I'd like to work on this unless other people is working on this

ShoyuVanilla commented 1 month ago

BTW, I've read some recent papers about cache eviction algorithms saying that they are more efficient than LRU

Though GC might not happen so many times in salsa's real world use cases and the efficient varies by using pattern, I think that it might be worthwhile to implement multiple eviction policies and open them as runtime options because they implementations are quite simple

ibraheemdev commented 1 month ago

I have been working on the first part of this described in the hackmd.

xiyuzhai commented 13 hours ago

I'm wondering how necessary is it. If we don't need to throw away any interned values, we can simply store them in a static pool. Not even needing a salsa db.

I'm thinking that immortal interned values are inherently independent from salsa's incremental computation capabilities.

Maybe a separate crate for these things. There are definitely scenarios where we don't care about the memory consumed by a certain class of interned values. And salsa's gc interned value might deserve a different name.

Separating these things shall make salsa design much easier, I think.

With that being said, it probably has to do with my own implementation of things. For my own programming language, I split types into ethereal types and fluffy types. Fluffy types are those that changes rapidly, for instance, those with some local lifetimes. But ethereal types are those like 'Vec' that are just stable. For my own case, I store fluffy types in a local arena tied to a specific region of code, which gc'ed automatically with the tracked function. And anything related to ethereal types are immortal, interned. However, if one use TypeId for both cases, then maybe one could worry about memory consumption issues.