tailcallhq / tailcall

High Performance GraphQL Runtime
https://tailcall.run
Apache License 2.0
1.28k stars 253 forks source link

feat: allow users to specify the http cache size in terms of total bytes rather than number of entries. #2035

Closed laststylebender14 closed 3 months ago

laststylebender14 commented 4 months ago

Description:

Currently, in our system, there is provision to set the HTTP cache size in terms of the number of entries a cache can hold, which may not be the most efficient approach. I propose allowing users to specify the cache size in terms of total bytes rather than the number of entries.

Presently, users define the cache size as the maximum number of entries it can hold using the @upstream(httpCache: X) directive.

Following example limits the cache to 42 entries.

@upstream(httpCache: 42)

Following example specifies a cache of 2 MB.

@upstream(httpCache: 2097152)

This feature request suggests transitioning to a byte-based specification to provide more granular control over cache size allocation and better accommodate diverse data sizes and resource requirements.

Technical requirements

sirish123 commented 4 months ago

Hey! @ranjitmahadik new to rust but will give this a shot, Here is what I think will work, this is the code snippet as given in the mocha documentation and by the look of it we have to add a weigher function to assess the true size or byte length.

        // Evict based on the byte length of strings in the cache.
            let cache = Cache::builder()

        // A weigher closure takes &K and &V and returns a u32
        // representing the relative size of the entry.

        .weigher(|_key, value: &String| -> u32 {
            value.len().try_into().unwrap_or(u32::MAX)
        })

        // This cache will hold up to 32MiB of values.

        .max_capacity(32 * 1024 * 1024)
        .build();

As you can see below the original code in tailcal, the modification in tailcall is essentially we just have to add the weigher function. After i create HttpCacheManagerByteSize with the weigher function, from there on I dont know what is the best way to invoke it i.e. how does one specifiy that he needs Byte level control in order to invoke HttpCacheManagerByteSize rather than the default HttpCacheManager

impl HttpCacheManager {
    pub fn new(cache_size: u64) -> Self {
        let cache = Cache::builder()
            .eviction_policy(EvictionPolicy::lru())
            .max_capacity(cache_size)
            .build();
        Self { cache: Arc::new(cache) }
    }

    pub async fn clear(&self) -> Result<()> {
        self.cache.invalidate_all();
        self.cache.run_pending_tasks().await;
        Ok(())
    }
}
impl HttpCacheManagerByteSize {
    pub fn new(cache_size: u64) -> Self {
        let cache = Cache::builder()
            .eviction_policy(EvictionPolicy::lru())
             .weigher(|_key, value: &String| -> u32 {
              value.len().try_into().unwrap_or(u32::MAX)
          })
            .max_capacity(cache_size)
            .build();
        Self { cache: Arc::new(cache) }
    }

    pub async fn clear(&self) -> Result<()> {
        self.cache.invalidate_all();
        self.cache.run_pending_tasks().await;
        Ok(())
    }
}
laststylebender14 commented 4 months ago

Hi @sirish123, welcome to the Rust community! I hope you're enjoying it so far. regarding your question, you can specify the total bytes the cache should use with the upstream directive. For more detailed information, you can refer to the upstream documentation.

documentation: https://tailcall.run/docs/directives/upstream/#httpcache

We are transitioning to a byte-based specification for the http cache. To use the newly created total byte-based cache manager, you can directly make changes in the following file.

https://github.com/tailcallhq/tailcall/blob/55fb2e30e5c7151f4cf6cdb4b298888fda3f2534/src/cli/runtime/http.rs#L118

If you have any more questions or need further assistance, feel free to ask.

sirish123 commented 4 months ago

Hey @laststylebender14 I know there is till lot to be done as I am not familiar with unit and data tests( will add upon your guidance ) but could you review the commit #2053 and tell what is wrong and where it could be improved.

tusharmath commented 4 months ago

I think the size should be specified as a percentage of system memory.

sirish123 commented 4 months ago

@tusharmath @laststylebender14 So lets say i give 15% as the percentage to be allocated, then at that point in time should it allocate 15% of the available memory or do we account for the total system memory( i.e. Ram size ) and then allocate 15% of the total Ram.

sirish123 commented 4 months ago

@laststylebender14 @tusharmath shall we keep this on hold?

tusharmath commented 4 months ago

We don't need to allocate upfront. We just need to evict lru items.

sirish123 commented 4 months ago

@tusharmath @laststylebender14 the http_cache_percentage(string) parameter was introduced in the latest commit which allows a user to specify the percentage of memory that should be allocated(i.e max size of cache) and as usual uses the LRU policy in case the max capacity of cache is reached.

sirish123 commented 4 months ago

@laststylebender14 @tusharmath all the comments except one have been addressed and I tried everything I could for making f32 variable in the struct which derives Eq and I dont think its possible. If I remove Eq a lot of code has to be rewritten and it breaks many core features which indeed need Eq, I think the string solution works perfectly though the user just has to add the extra double quotes and further suggestions are welcome!

sirish123 commented 4 months ago

@tusharmath @laststylebender14 all your requested changes have been implemented and all test cases have passed

github-actions[bot] commented 3 months ago

Action required: Issue inactive for 30 days. Status update or closure in 7 days.

github-actions[bot] commented 3 months ago

Issue closed after 7 days of inactivity.