bblanchon / ArduinoJson

📟 JSON library for Arduino and embedded C++. Simple and efficient.
https://arduinojson.org
MIT License

Filtering large arrays of objects #2072

Closed: T-vK closed this issue 5 months ago

T-vK commented 6 months ago

Describe the issue
I am sending requests to Spotify's get-audio-analysis API endpoint, which returns huge JSON documents (300 KB+).

I only want to get:

meta.timestamp
beats[].start
bars[].start

Environment
Here is the environment that I'm using:

Reproduction
Here is a small snippet that demonstrates the problem.

#include <WiFiClient.h>
#include <HTTPClient.h>
#include <ArduinoJson.h>

// ...

WiFiClient responseBodyStream = http.getStream();

// This works, but is not what I want:
JsonDocument filter;
filter["meta"] = true;
filter["beats"] = true;
filter["bars"] = true;

// This is what I want, but obviously does not work (`[]` is not valid C++):
// JsonDocument filter;
// filter["meta"] = true;
// filter["beats"][]["start"] = true;
// filter["bars"][]["start"] = true;

JsonDocument responseBodyJsonDoc;
DeserializationError error = deserializeJson(responseBodyJsonDoc, responseBodyStream, DeserializationOption::Filter(filter));

serializeJsonPretty(responseBodyJsonDoc, Serial);

// Release resources
responseBodyJsonDoc.clear(); // Clear the JSON document
responseBodyStream.stop(); // Close the WiFiClient
http.end();
Serial.println(ESP.getFreeHeap());

// ...

I would also like to extend the filter to only keep the items of beats/bars that match:

beats[].start >= n
bars[].start >= n

I was not able to find a way to do this. Any ideas?

bblanchon commented 6 months ago

Hi @T-vK,

As you can confirm with the ArduinoJson Assistant, the filter you need is this:

{
  "meta": {
    "timestamp": true
  },
  "beats": [
    {
      "start": true
    }
  ],
  "bars": [
    {
      "start": true
    }
  ]
}

You can create this filter like so:

JsonDocument filter;
filter["meta"]["timestamp"] = true;
filter["beats"][0]["start"] = true;
filter["bars"][0]["start"] = true;

In the last part of your message, you say you want to apply logic operators to filters. Unfortunately, it is not possible at the moment, and I don't see it implemented in the foreseeable future. This request has been made multiple times (#1721, #1486, #1316, #1708), but we have yet to find a suitable solution.

As a workaround, I suggest you try the deserialization in chunks technique.

Best regards, Benoit

T-vK commented 6 months ago

Ohh, I thought [0] meant that the filter applied only to the first item of the array. I actually thought there was a bug, because filter["beats"][0] = true; still keeps all beats items, not just the first. Thanks for the explanation! 🙏

Chunk deserialization seems to be very invasive in the streaming process. I don't really understand how I would combine the usage of filters and chunk deserialization in order to get rid of some items from some arrays. Are there any examples?

bblanchon commented 6 months ago

Indeed, deserialization in chunks requires significant modifications in your program. I do not have an example that precisely matches your use case, but you can find one that combines both techniques in the Reddit case study of Mastering ArduinoJson.
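
Not a drop-in example, but the control flow of the chunked approach can be sketched roughly as follows. This is a host-side illustration in standard C++ rather than ArduinoJson code: `skipPast`, `readObject`, `numberField`, and `startsFrom` are hypothetical helpers invented for this sketch. On the device, `Stream::find()` and one `deserializeJson()` call per array element (optionally with a filter) would presumably take their place. It assumes compact JSON (no whitespace between a key and its value) and array elements that are flat objects of numbers, which is what `beats` and `bars` appear to contain.

```cpp
#include <cstdlib>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Advances `in` until just after the first occurrence of `target`.
// Stands in for Arduino's Stream::find(); returns false on end of input.
bool skipPast(std::istream& in, const std::string& target) {
  std::size_t matched = 0;
  char c;
  while (matched < target.size() && in.get(c)) {
    if (c == target[matched]) {
      ++matched;
    } else {
      matched = (c == target[0]) ? 1 : 0;  // naive restart, like Stream::find()
    }
  }
  return matched == target.size();
}

// Reads the next {...} element into `out`.
// Returns false when the enclosing array closes or the input ends.
bool readObject(std::istream& in, std::string& out) {
  out.clear();
  char c;
  while (in.get(c)) {  // skip commas and whitespace between elements
    if (c == '{') break;
    if (c == ']') return false;
  }
  if (!in) return false;
  out.push_back('{');
  int depth = 1;
  while (depth > 0 && in.get(c)) {
    out.push_back(c);
    if (c == '{') ++depth;
    else if (c == '}') --depth;
  }
  return depth == 0;
}

// Extracts the numeric value stored under `key`; returns false if it is absent.
bool numberField(const std::string& object, const std::string& key, double& value) {
  const std::string needle = "\"" + key + "\":";
  const std::size_t pos = object.find(needle);
  if (pos == std::string::npos) return false;
  const char* begin = object.c_str() + pos + needle.size();
  char* end = nullptr;
  value = std::strtod(begin, &end);
  return end != begin;
}

// Streams the array named `arrayKey` one element at a time and keeps only
// the start times that are >= minStart. Only one element is buffered at once.
std::vector<double> startsFrom(std::istream& in, const std::string& arrayKey,
                               double minStart) {
  std::vector<double> kept;
  if (!skipPast(in, "\"" + arrayKey + "\":[")) return kept;
  std::string object;
  while (readObject(in, object)) {
    double start = 0;
    if (numberField(object, "start", start) && start >= minStart) {
      kept.push_back(start);
    }
  }
  return kept;
}
```

Because the input is consumed strictly forward, the arrays have to be requested in the order in which they appear in the response. Peak memory is bounded by one array element plus the values that pass the `start >= n` test, instead of the whole filtered document.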

Are you sure you can't work with just the filter? Did you check with the ArduinoJson Assistant how much RAM you need?

T-vK commented 6 months ago

I'll look into it. The issue is that the project I'm working on already uses most of the RAM even without the Spotify integration, so I need to save as much as possible to get a stable result. At the moment, songs containing more than n beats reliably cause my ESP to run out of memory.