apache / drill

Apache Drill is a distributed MPP query layer for self describing data
https://drill.apache.org/
Apache License 2.0

Query results caching #2881

Closed RayetSena closed 2 months ago

RayetSena commented 4 months ago

Hello,

I am using the HTTP storage plugin and want to enable result caching. According to the documentation, this requires adding cacheResults to the plugin configuration, so I added it. However, when I run the same query a second time, the query duration is almost unchanged. The configuration and query durations are shown below.

image

{
  "type": "http",
  "cacheResults": true,
  "connections": {
    "opendata": {
      "url": "https://opendata.cbs.nl/ODataApi/odata/",
      "requireTail": true,
      "method": "GET",
      "authType": "none",
      "inputType": "json",
      "xmlDataLevel": 1,
      "postParameterLocation": "QUERY_STRING",
      "verifySSLCert": false
    }
  },
  "timeout": 4000,
  "retryDelay": 5000,
  "proxyType": "direct",
  "authMode": "USER_TRANSLATION",
  "enabled": true
}

image
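For anyone trying to reproduce the timing comparison outside the web UI, here is a minimal sketch that submits the same query several times through Drill's REST endpoint (POST /query.json) and records the wall-clock duration of each run. It assumes a Drillbit reachable at http://localhost:8047 without web authentication. The helper names run_drill_query and time_repeated_runs, and the query in the trailing comment, are hypothetical and need to be adapted to your setup.

```python
import json
import time
import urllib.request


def run_drill_query(sql, base_url="http://localhost:8047"):
    """Submit one SQL query to Drill's REST endpoint and return the parsed JSON reply.

    base_url is an assumption (default Drill web port, no authentication).
    """
    body = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    request = urllib.request.Request(
        base_url + "/query.json",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def time_repeated_runs(run_once, repeats=3):
    """Call run_once() `repeats` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        durations.append(time.perf_counter() - start)
    return durations


# Example usage (placeholder query; substitute your own plugin, connection and tail):
#   sql = "SELECT * FROM <plugin>.opendata.`<tail>` LIMIT 10"
#   for i, seconds in enumerate(time_repeated_runs(lambda: run_drill_query(sql)), 1):
#       print(f"run {i}: {seconds:.3f} s")
```

If caching were taking effect, one would expect the second and later runs to be noticeably faster than the first. Roughly equal durations match the behaviour described above.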

I also tried configuring caching in the drill-override.conf file. This is the configuration I added:

drill.exec: {
  caching: {
    enabled: true
    storage: {
      type: "local"
      path: "cache/directory"
    }
  }
}

Drill version 1.21.1

RayetSena commented 3 months ago

When I test with an older version (1.20.1), I can see the query result files under the tmp/http-cache directory. When I test with a newer version (1.21.1), that directory contains no query result files, only the journal.
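The directory check described above can be scripted so that it is easy to repeat across versions. This is a rough sketch, not part of Drill: the function name summarize_http_cache is hypothetical, and it assumes that any regular file whose name does not start with "journal" is a cached query result.

```python
from pathlib import Path


def summarize_http_cache(cache_dir):
    """Split the regular files in cache_dir into journal files and all other files.

    Assumption: files not named journal* are cached responses.
    """
    journal, entries = [], []
    for path in sorted(Path(cache_dir).iterdir()):
        if not path.is_file():
            continue
        if path.name.startswith("journal"):
            journal.append(path.name)
        else:
            entries.append(path.name)
    return {"journal": journal, "entries": entries}


# Example usage (the path is relative to wherever your Drill install keeps its tmp directory):
#   print(summarize_http_cache("tmp/http-cache"))
```

An empty "entries" list after running a query would correspond to the 1.21.1 behaviour reported here, where only the journal is present.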

jnturton commented 2 months ago

Promoted to DRILL-8487.