PsRamFS as LittleFS cache to accelerate a web server - viable?

mhaberler commented 1 year ago

Hi,

this looks like a very interesting project.

I have a performance issue with an ESPAsyncWebServer-based web server serving from a small LittleFS root (maybe 1 MB): it is so slow that reading large files triggers the watchdog, no matter how long I set the timeout

so I was thinking about caching the LittleFS content in a PSRAM-based filesystem

Do you think this project is a viable base for such an effort?

what I would do is:

- after boot, recursively copy the LittleFS contents to the PsRamFS (probably in a background task)
- when serving a file, first check whether it is available in PsRamFS, and fall back to LittleFS only if it is not
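
The two steps above can be sketched as a lookup that prefers the RAM copy and only touches the slow filesystem on a miss. This is a host-side illustration in plain C++, not ESP32 code: `FileCache`, `preload`, `get` and `SlowReader` are hypothetical names, and a `std::function` stands in for a LittleFS read.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for "read this path from LittleFS":
// fills `out` and returns true, or returns false if the file is missing.
using SlowReader = std::function<bool(const std::string& path, std::string& out)>;

class FileCache {
public:
    explicit FileCache(SlowReader reader) : reader_(std::move(reader)) {}

    // Step 1: copy one file into RAM (on the ESP32 this would run in a background task).
    bool preload(const std::string& path) {
        std::string content;
        if (!reader_(path, content)) return false;
        cache_[path] = std::move(content);
        return true;
    }

    // Step 2: serve from RAM when possible, otherwise fall back to the slow reader.
    bool get(const std::string& path, std::string& out) const {
        auto it = cache_.find(path);
        if (it != cache_.end()) {
            out = it->second;
            return true;
        }
        return reader_(path, out);
    }

private:
    SlowReader reader_;
    std::map<std::string, std::string> cache_;
};
```

Because `get` falls back on a miss, the server can start answering requests before the background copy has finished; files simply get faster as they land in the cache.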

another option would be to cook up something like a PSRAM-based LittleFS subclass which would just suck up the whole partition into PSRAM and serve from there, but I have no grasp of how complex this would turn out to be.

Does this sound reasonable? which option would you choose?

I would be grateful for a frank opinion.

best regards Michael

tobozo commented 1 year ago

hi, thanks for the feedback :+1:

Copying big files from LittleFS to PSRam could work if you don't create folders: the vfs implementation for folders is incomplete and may trigger bugs, but it should be fine if you only create files at the root of psramfs filesystem.

Another approach would be to maintain your own file struct and use RomDiskStream with ESPAsyncWebServer: preload your files at boot, and you can even unmount LittleFS afterwards.

#include <LittleFS.h>
#include <PSRamFS.h>
#include <map>

void toPsram( fs::FS &sourceFS, const char* path );

struct PsRamFile_t
{
  char* data;
  size_t size;
  RomDiskStream getStream() { return RomDiskStream((const uint8_t*)data, size); }
};

std::map<String,PsRamFile_t> MyPsramFiles; // global psram-files storage

void toPsram( fs::FS &sourceFS, const char* path )
{
  fs::File myFile = sourceFS.open(path);
  if (!myFile ) return;

  auto data_mem = ((char*)ps_malloc(myFile.size()+1));

  if( data_mem == NULL ) { // malloc failed !
    myFile.close();
    return;
  }

  PsRamFile_t myPsRamFile = { .data=data_mem, .size=myFile.size() };

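  size_t bytes_read = myFile.readBytes( myPsRamFile.data, myFile.size() );
  myPsRamFile.data[bytes_read] = '\0'; // terminate the buffer; this is what the extra byte in ps_malloc() is for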

  if( myFile.size() != bytes_read ) {
    // incomplete copy ?
    Serial.printf("[WARNING] File copy missed %u bytes for %s\n", (unsigned)(myFile.size()-bytes_read), path );
  }

  myFile.close();

  MyPsramFiles[path] = myPsRamFile; // add file to the global psram-files map
}


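    // somewhere in the web server's request handling

    String myFilePath = "/path/to/myFile.txt";

    // find() avoids inserting an empty entry, which operator[] would do on a miss
    auto it = MyPsramFiles.find(myFilePath);
    if( it != MyPsramFiles.end() ) {
        RomDiskStream stream = it->second.getStream();
        // note: streamFile() is the synchronous WebServer call; with ESPAsyncWebServer
        // the data would be sent through the request object instead
        server.streamFile(stream, "text/plain");
    }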

mhaberler commented 1 year ago

That is a very elegant solution, thanks!

I will give this a try and report back.

Appreciated! Michael

mhaberler commented 1 year ago

just noting that I looked into https://github.com/PaulStoffregen/LittleFS/blob/main/src/LittleFS.h#L362-L445 with the idea of instantiating this with the partition name and having it read the partition into PSRAM at begin() time

unfortunately it is not very general code and pretty much targeted to the Teensy hardware

your solution looks far more straightforward

tobozo commented 1 year ago

thanks for that link, very interesting :+1:

Although it's POSIX compliant, the Teensy core doesn't seem to implement an RTOS VFS layer (it probably doesn't need to), which the ESP32 relies upon. This may explain why their LittleFS code can't be used as is.

But I like the idea of inheriting LittleFS instead of just FS and I'll research that angle as soon as I have some time.

mhaberler commented 1 year ago

well that could be something rather useful

I've been exploring the RAM disk / caching options for slow flash-based filesystems and it is pretty barren land, despite the obvious reasons to have something like that

I really wonder how people run ESPAsyncWebServer in stable fashion given say LittleFS static file serving - solution "disable the watchdog and hope for the best" is so-so

Chrome especially is super-aggressive in loading js/html/css etc. in parallel on startup, and so far I cannot even get past Chrome :-)

I dislike the solution of storing files in C arrays and serving those, that looks pretty inflexible to me for say deploying a small fix in a web app

bundling the web app is an option to reduce parallel initial loads, but not so cool during development

mhaberler commented 1 year ago

well, we have a result - at least up to cache loading:

https://github.com/mhaberler/esp32-arduino-playground/blob/main/src/TreeWalker-test/main.cpp
https://github.com/mhaberler/esp32-arduino-playground/blob/main/src/TreeWalker-test/TreeWalker.hpp

pre: free PSRAM=4192123

d       0 /www                                               2023-05-09 18:54:21
d       0 /www/classes                                       2023-05-08 21:33:43
d       0 /www/classes/3d                                    2023-05-08 21:33:43
f    2108 /www/classes/3d/Shape3D.js.gz                      2023-05-08 21:33:43
f    1590 /www/classes/3d/World.js.gz                        2023-05-08 21:33:43
....
f     890 /www/utils/view/paint.js.gz                        2023-05-08 21:33:43
f     980 /www/utils/view/updateView.js.gz                   2023-05-08 21:33:43

post: free PSRAM=3193923 used=998200
52 files, 997239 bytes cached in 5.354 S - 181.895 kB/s
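
As a sanity check on the figures above: the reported rate matches the byte count and duration if 1 kB is taken as 1024 bytes, and the gap between PSRAM used and bytes cached comes to roughly 18 bytes per file (presumably allocation overhead, though the log does not say). A small sketch of the arithmetic, using only numbers from the log; the function names are made up for illustration:

```cpp
#include <cstdio>

// Figures copied from the log output above.
constexpr double kBytesCached = 997239.0;
constexpr double kSeconds     = 5.354;
constexpr long   kPsramBefore = 4192123;
constexpr long   kPsramAfter  = 3193923;
constexpr int    kFiles       = 52;

// Throughput in units of 1024 bytes per second.
double throughputKiB() { return kBytesCached / kSeconds / 1024.0; }

// PSRAM consumed beyond the raw file contents, averaged per file.
double overheadPerFile() {
    return ((kPsramBefore - kPsramAfter) - kBytesCached) / kFiles;
}

// Print both derived figures on one line.
void printSummary() {
    std::printf("%.3f kB/s, %.1f bytes overhead per file\n",
                throughputKiB(), overheadPerFile());
}
```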

thanks!

next step: implant into my application's webserver

tobozo commented 1 year ago

Beware of isDirectory() though as it doesn't have a consistent behaviour across filesystems.

https://github.com/espressif/arduino-esp32/issues/3130

mhaberler commented 1 year ago

interesting, thanks I'll keep an eye on it

lbernstone commented 10 months ago

Just as another possibility, for fixed files like this, I think it is better to put them into your firmware as constants. You can script something with xxd and a parser (perl, sed, awk) to build the C-style array needed and stuff them into a header file to just serve them as strings. Then you also have these files included in your firmware for easier distribution. It is still dependent on the flash speed, but it removes a couple layers of abstraction. I don't have problems timing out on fairly large javascripts with that method. https://github.com/lbernstone/rrdtool_ESP32/blob/embedded_script/examples/basicDB/javascriptrrd_wlibs-min_js_gz.h
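
The conversion step described above can be sketched in a few lines. This is an illustrative host-side helper, not the script used for the linked header: `toCArray` and the `_len` naming are assumptions, loosely modelled on the shape of `xxd -i` output.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Render a byte buffer as a C array definition, 12 bytes per line,
// followed by a length constant. `symbol` must be a valid C identifier.
std::string toCArray(const std::string& symbol,
                     const std::vector<unsigned char>& bytes) {
    std::string out = "const unsigned char " + symbol + "[] = {\n";
    char hex[8];
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (i % 12 == 0) out += "  ";                      // indent each row
        std::snprintf(hex, sizeof(hex), "0x%02x", static_cast<unsigned>(bytes[i]));
        out += hex;
        if (i + 1 < bytes.size()) out += ",";              // no comma after the last byte
        if (i % 12 == 11 || i + 1 == bytes.size()) out += "\n";
        else out += " ";
    }
    out += "};\nconst unsigned int " + symbol + "_len = "
         + std::to_string(bytes.size()) + ";\n";
    return out;
}
```

Writing the returned string to a `.h` file for each (gzipped) asset as a build step gives headers that can be included in the firmware; the trade-off raised earlier in the thread still applies, since any change to the web app then needs a firmware rebuild.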

tobozo / ESP32-PsRamFS

PsRamFS as LittleFS cache to accelerate a web server - viable? #11