openfoodfacts / openfoodfacts-query

This extracts key product data from MongoDB into a Postgres database to speed up OFF
GNU Affero General Public License v3.0
0 stars 0 forks source link

Switch from json to jsonb #49

Open cquest opened 1 month ago

cquest commented 1 month ago

jsonb has many benefits :

"In general, most applications should prefer to store JSON data as jsonb, unless there are quite specialized needs, such as legacy assumptions about ordering of object keys." - postgresql's documentation

john-gom commented 1 month ago

Hi @cquest . I did initially try using JSONB but there were two main factors that made me stick with JSON:

  1. The initial storage time is increased as Postgres needs to parse the supplied JSON
  2. Queries performed by off-query do not use the JSON data, they use the relational model, so having the pre-parsed JSONB has no benefit
alexgarel commented 2 weeks ago

@john-gom reduce storage size can be considered enough of a benefice for the time spent parsing.

john-gom commented 2 weeks ago

Hi @alexgarel . I am working on a branch on this (although paused for Keycloak) so that the full Product JSON is only loaded into a temporary table for each batch of products being imported, i.e. not keep the original JSON at all. This should significantly reduce the database size