Closed rafatower closed 5 years ago
Hey!
I was thinking about how to implement a better Cache-Control
strategy in Windshaft-CartoDB
and I realized when there are foreign tables involved in the layer's query I will need to know if one of the affected tables is a foreign table or not. In fine with the approach of returning a NOW()
instead of failing but also I will need to know if the table is a remote table or a local table when we call to CDB_QueryTables_Updated_At
method. If I'm not wrong, the following should do the trick:
$ git diff
diff --git a/scripts-available/CDB_ForeignTable.sql b/scripts-available/CDB_ForeignTable.sql
index 98d3c9b..8311f88 100644
--- a/scripts-available/CDB_ForeignTable.sql
+++ b/scripts-available/CDB_ForeignTable.sql
@@ -171,7 +171,7 @@ AS $$
LEFT JOIN pg_catalog.pg_namespace n ON c.relnamespace = n.oid
WHERE c.oid = query_tables_oid.reloid
)
- SELECT fqtn.dbname, fqtn.schema_name, fqtn.table_name,
+ SELECT fqtn.dbname, fqtn.schema_name, fqtn.table_name, fqtn.relkind,
(CASE WHEN relkind = 'f' THEN @extschema@.CDB_Get_Foreign_Updated_At(reloid)
ELSE (SELECT md.updated_at FROM @extschema@.CDB_TableMetadata md WHERE md.tabname = reloid)
END) AS updated_at
Having such info I can decide what max_age
directive in Cache-Control
header is the best to apply depending on the case.
Thoughts?
And the following are the changes in Maps API:
$ git diff
diff --git a/lib/cartodb/api/middlewares/cache-control-header.js b/lib/cartodb/api/middlewares/cache-control-header.js
index d72ee0c..1d13e20 100644
--- a/lib/cartodb/api/middlewares/cache-control-header.js
+++ b/lib/cartodb/api/middlewares/cache-control-header.js
@@ -1,6 +1,7 @@
'use strict';
const ONE_YEAR_IN_SECONDS = 60 * 60 * 24 * 365;
+const FIVE_MINUTES_IN_SECONDS = 60 * 5;
module.exports = function setCacheControlHeader ({ ttl = ONE_YEAR_IN_SECONDS, revalidate = false } = {}) {
return function setCacheControlHeaderMiddleware (req, res, next) {
@@ -8,14 +9,27 @@ module.exports = function setCacheControlHeader ({ ttl = ONE_YEAR_IN_SECONDS, re
return next();
}
- const directives = [ 'public', `max-age=${ttl}` ];
+ const { mapConfigProvider } = res.locals;
- if (revalidate) {
- directives.push('must-revalidate');
- }
+ mapConfigProvider.getAffectedTables((err, affectedTables) => {
+ if (err) {
+ global.logger.warn('ERROR generating Cache Control Header:', err);
+ return next();
+ }
+
+ if (affectedTables && !affectedTables.tables.every(table => table.relkind !== 'f')) {
+ ttl = FIVE_MINUTES_IN_SECONDS;
+ }
+
+ const directives = [ 'public', `max-age=${ttl}` ];
+
+ if (revalidate) {
+ directives.push('must-revalidate');
+ }
- res.set('Cache-Control', directives.join(','));
+ res.set('Cache-Control', directives.join(','));
- next();
+ next();
+ });
};
};
I'd prefer returning a magic value instead of now() when the table isn't present and use that to signal a lower cache time.
That way it would work in the future if we decide to use cache and we can even enable it in the local tables so it's also used when they aren't cartodbfyied.
That way it would work in the future if we decide to use cache and we can even enable it in the local tables so it's also used when they aren't cartodbfyied.
You mean remote tables
, don't you?. If so, I agree on not exposing the type of table so the day we can know when a remote table was updated we won't need to change anything in Windshaft-CartoDB
.
You mean remote tables, don't you?. If so, I agree on not exposing the type of table so the day we can know when a remote table was updated we won't need to change anything in Windshaft-CartoDB.
Yes, but also local tables that aren't cartodbfyied. The current function returns NULL on those cases, so if you have a JOIN of a cartodbfied table and a non-cartodbfied one (no matter if local or remote) you just get the last update time of the first.
If instead of returning NOW() or NULL we returned a huge epoch into the future when a table isn't in the tablemetadata table (local or remote), we could detect that in Windshaft and set a low/5 minutes cache to avoid "my map is not refreshing" complains.
Yes, but also local tables that aren't cartodbfyied. The current function returns NULL on those cases
then probably when I saw some Last-updated
header with the epoch zero was because of a nasty conversion of NULL to a timestamp somewhere in the chain. Mind the interactions with varnish/cdn.
cc/ @jgoizueta to coordinate release and deploys
This may happen with non-carto DB's, when checking the updated_at times and not finding the corresponding remote.cdb_tablemetadata imported from the foreign non-carto DB.
Instead of failing, return a NOW() timestampt, so that caching logic just assumes there may have been changes.
This makes it work today, and leaves open the possibility of adding the required carto metadata for homogeneous caching in the future.
With this I can use the Maps API against a non-carto foreign table, without needing any other patches.