Open Stebalien opened 6 years ago
Since CIDs are fairly small, my instinct is that this will be more trouble than it's worth and is unlikely to reduce memory usage enough to justify it.
You're probably right. We probably do store duplicate CIDs at some points; however, most of these are likely ephemeral. All of our caches tend to work with blocks from disk, so they'll all use the same CID from the same memory location.
(but it shouldn't be difficult to investigate this)
Now that we have nice, string-backed CIDs, we should consider caching them in an LRU cache. My hypothesis is that, when working with CIDs, we likely regenerate them several times. For example, with bitswap, we'll re-create the CIDs when we receive the blocks we're looking for. When we do this, we may end up storing each CID in memory twice.
Note: we'll have to be very careful with this. This is the kind of optimization that could end up hurting performance (or even memory usage) more than it helps if we're not careful. It may not even be worth it in practice.
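To make the idea concrete, here is a minimal sketch of what such a cache could look like. It works on plain strings standing in for string-backed CIDs; `Interner`, `NewInterner`, `Intern`, and `Len` are hypothetical names for illustration, not part of any existing CID library. It is not goroutine-safe, so real use would need a lock.

```go
package main

import "container/list"

// Interner is a hypothetical fixed-size LRU cache that maps the string form
// of a CID to one canonical copy of that string.
type Interner struct {
	capacity int
	order    *list.List               // front = most recently used
	entries  map[string]*list.Element // element values are strings
}

// NewInterner creates an Interner holding at most capacity entries.
// Capacities below 1 are treated as 1.
func NewInterner(capacity int) *Interner {
	if capacity < 1 {
		capacity = 1
	}
	return &Interner{
		capacity: capacity,
		order:    list.New(),
		entries:  make(map[string]*list.Element, capacity),
	}
}

// Intern returns the cached copy of key if one exists. Otherwise it caches
// key, evicting the least recently used entry when full, and returns key.
func (in *Interner) Intern(key string) string {
	if el, ok := in.entries[key]; ok {
		in.order.MoveToFront(el)
		return el.Value.(string)
	}
	if in.order.Len() >= in.capacity {
		oldest := in.order.Back()
		in.order.Remove(oldest)
		delete(in.entries, oldest.Value.(string))
	}
	in.entries[key] = in.order.PushFront(key)
	return key
}

// Len reports how many entries are currently cached.
func (in *Interner) Len() int {
	return in.order.Len()
}
```

A caller that re-creates a CID (say, from a received block) would pass its string form through `Intern` and keep the returned value, so the freshly allocated duplicate can be garbage collected. The sketch also shows where the cost comes from: every cached entry adds a list element and a map slot on top of the string itself, which may well outweigh the savings for small CIDs.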