ngxson / wllama

WebAssembly binding for llama.cpp - Enabling in-browser LLM inference
https://huggingface.co/spaces/ngxson/wllama
MIT License

[Feature Request] Allow setting our own Cache Manager #108

Closed felladrin closed 1 month ago

felladrin commented 1 month ago

I'd love to be able to set up my own Cache Manager for cases where I need to customize it.

Since the Cache Manager already has a settled signature, it would be good if we could pass our own implementation during Wllama initialization.

So we could add a cacheManager field to WllamaConfig:

https://github.com/ngxson/wllama/blob/667dd9192540ae15a806ef8b17d3fc1728018e4d/src/wllama.ts#L14

cacheManager: {
    /**
     * Convert a given URL into file name in cache.
     *
     * Format of the file name: `${hashSHA1(fullURL)}_${fileName}`
     */
    getNameFromURL(url: string): Promise<string>;
    /**
     * Write a new file to cache. This will overwrite existing file.
     *
     * @param name The file name returned by `getNameFromURL()` or `list()`
     */
    write(name: string, stream: ReadableStream, metadata: CacheEntryMetadata): Promise<void>;
    /**
     * Open a file in cache for reading
     *
     * @param name The file name returned by `getNameFromURL()` or `list()`
     * @returns ReadableStream, or null if file does not exist
     */
    open(name: string): Promise<ReadableStream | null>;
    /**
     * Get the size of a file in stored cache
     *
     * NOTE: if the download is stopped midway (e.g. the user closes the browser tab), the file may be corrupted and its size may differ from `metadata.originalSize`
     *
     * @param name The file name returned by `getNameFromURL()` or `list()`
     * @returns number of bytes, or -1 if file does not exist
     */
    getSize(name: string): Promise<number>;
    /**
     * Get metadata of a cached file
     */
    getMetadata(name: string): Promise<CacheEntryMetadata | null>;
    /**
     * List all files currently in cache
     */
    list(): Promise<CacheEntry[]>;
    /**
     * Clear all files currently in cache
     */
    clear(): Promise<void>;
    /**
     * Delete a single file in cache
     *
     * @param nameOrURL Can be either an URL or a name returned by `getNameFromURL()` or `list()`
     */
    delete(nameOrURL: string): Promise<void>;
    /**
     * Delete multiple files in cache.
     *
     * @param predicate A predicate like `array.filter(item => boolean)`
     */
    deleteMany(predicate: (e: CacheEntry) => boolean): Promise<void>;
    /**
     * Internally used
     */
    _writeMetadata(name: string, metadata: CacheEntryMetadata): Promise<void>;
};
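For illustration, a custom implementation of this signature could look like the in-memory sketch below. `InMemoryCacheManager` is a hypothetical name, and the field names of `CacheEntryMetadata` and `CacheEntry` are assumptions made for this example; the real types are defined in wllama and may differ. It buffers whole files in memory, so it is only meant to show the shape of the object, not a production cache.

```typescript
// Assumed shapes; check the real CacheEntryMetadata / CacheEntry types in wllama.
interface CacheEntryMetadata {
  originalURL: string;
  originalSize: number;
  etag: string;
}
interface CacheEntry {
  name: string;
  size: number;
  metadata: CacheEntryMetadata;
}

// Hypothetical in-memory cache manager following the proposed signature.
class InMemoryCacheManager {
  private files = new Map<string, { data: ArrayBuffer; metadata: CacheEntryMetadata }>();

  // Builds `${hashSHA1(fullURL)}_${fileName}` using the Web Crypto API.
  async getNameFromURL(url: string): Promise<string> {
    const digest = await crypto.subtle.digest('SHA-1', new TextEncoder().encode(url));
    const hex = Array.from(new Uint8Array(digest))
      .map((b) => b.toString(16).padStart(2, '0'))
      .join('');
    const fileName = url.split('/').pop() || 'default.gguf'; // fallback name is an assumption
    return `${hex}_${fileName}`;
  }

  async write(name: string, stream: ReadableStream, metadata: CacheEntryMetadata): Promise<void> {
    // Buffers the whole stream; acceptable for a sketch, not for multi-GB models.
    const data = await new Response(stream).arrayBuffer();
    this.files.set(name, { data, metadata });
  }

  async open(name: string): Promise<ReadableStream | null> {
    const file = this.files.get(name);
    return file ? new Blob([file.data]).stream() : null;
  }

  async getSize(name: string): Promise<number> {
    return this.files.get(name)?.data.byteLength ?? -1;
  }

  async getMetadata(name: string): Promise<CacheEntryMetadata | null> {
    return this.files.get(name)?.metadata ?? null;
  }

  async list(): Promise<CacheEntry[]> {
    return Array.from(this.files, ([name, file]) => ({
      name,
      size: file.data.byteLength,
      metadata: file.metadata,
    }));
  }

  async clear(): Promise<void> {
    this.files.clear();
  }

  async delete(nameOrURL: string): Promise<void> {
    // Try the argument as a cache name first, then treat it as a URL.
    if (!this.files.delete(nameOrURL)) {
      this.files.delete(await this.getNameFromURL(nameOrURL));
    }
  }

  async deleteMany(predicate: (e: CacheEntry) => boolean): Promise<void> {
    for (const entry of await this.list()) {
      if (predicate(entry)) this.files.delete(entry.name);
    }
  }

  async _writeMetadata(name: string, metadata: CacheEntryMetadata): Promise<void> {
    const file = this.files.get(name);
    if (file) file.metadata = metadata;
  }
}
```

An object like this is what the proposed `cacheManager` option would accept; whether wllama ends up exposing the option under that name is not settled in this thread.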

I see we can overwrite the cacheManager on the Wllama instance: https://github.com/ngxson/wllama/blob/667dd9192540ae15a806ef8b17d3fc1728018e4d/src/wllama.ts#L156-L157

But since the cacheManager is not passed down to MultiDownloads->GGUFRemoteBlob, the overwritten cacheManager is never actually used.
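To make the pass-down concrete, here is a rough sketch of the wiring I have in mind. All class names ending in `Sketch`, the `CacheLike` interface, and `prepareDownloads` are hypothetical stand-ins invented for this example; they are not wllama's actual classes or methods. The only point is that each layer receives the cache manager from its caller instead of reaching for a default one.

```typescript
// Reduced stand-in for the cache manager interface (only what the sketch needs).
interface CacheLike {
  getSize(name: string): Promise<number>;
}

// Hypothetical stand-in for GGUFRemoteBlob: the cache is injected, not imported.
class RemoteBlobSketch {
  constructor(readonly url: string, readonly cache: CacheLike) {}
}

// Hypothetical stand-in for MultiDownloads: forwards the cache to every blob.
class MultiDownloadsSketch {
  constructor(private urls: string[], private cache: CacheLike) {}

  createBlobs(): RemoteBlobSketch[] {
    return this.urls.map((url) => new RemoteBlobSketch(url, this.cache));
  }
}

// Default used when the caller does not supply a cache manager.
const defaultCache: CacheLike = { getSize: async () => -1 };

// Hypothetical stand-in for Wllama: takes the cache from config and threads it down.
class WllamaSketch {
  readonly cacheManager: CacheLike;

  constructor(config: { cacheManager?: CacheLike } = {}) {
    this.cacheManager = config.cacheManager ?? defaultCache;
  }

  prepareDownloads(urls: string[]): RemoteBlobSketch[] {
    return new MultiDownloadsSketch(urls, this.cacheManager).createBlobs();
  }
}
```

With this wiring, `new WllamaSketch({ cacheManager: myCache }).prepareDownloads([...])` yields blobs that all hold `myCache`, which is the behavior that is missing today.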


Reasoning: iPads now load websites in Desktop Mode by default, so their user agent identifies them as Macs rather than as mobile devices. This has broken the following check, so models can no longer be loaded from the cache on iPads: https://github.com/ngxson/wllama/blob/667dd9192540ae15a806ef8b17d3fc1728018e4d/src/cache-manager.ts#L347-L355

If we could overwrite the cacheManager, a hotfix could be done without having to wait for a new release of Wllama.
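As a sketch of what such a hotfix might check, assuming the linked code relies on the user-agent string alone: a common heuristic is to also treat a "Mac" that reports multiple touch points as an iPad. The `NavigatorLike` type and the `isAppleMobile` function are hypothetical names for this example; `userAgent`, `platform`, and `maxTouchPoints` are standard `navigator` properties.

```typescript
// Subset of `navigator` used by the check, so it can be tested outside a browser.
interface NavigatorLike {
  userAgent: string;
  platform: string;
  maxTouchPoints: number;
}

// Hypothetical replacement check: true for iPhone/iPod/iPad, including
// iPads that present themselves as a Mac in Desktop Mode.
function isAppleMobile(nav: NavigatorLike): boolean {
  const classicUserAgent = /iPhone|iPad|iPod/.test(nav.userAgent);
  // Heuristic: iPadOS in Desktop Mode reports a Mac platform but still
  // exposes several touch points, while Macs are expected to report 0.
  const iPadInDesktopMode = nav.platform === 'MacIntel' && nav.maxTouchPoints > 1;
  return classicUserAgent || iPadInDesktopMode;
}
```

In a browser this would be called as `isAppleMobile(navigator)`. The touch-point heuristic would need revisiting if Apple ever ships a touchscreen Mac.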

ngxson commented 1 month ago

Yes, thanks for the suggestion. I also thought about completely separating 3 interfaces and allowing users to swap in their own objects:

This may introduce some breaking changes, so I may need to release it as a major (2.0) version.