Open john-gom opened 1 year ago
POC repo is https://github.com/john-gom/openfoodfacts-data
Using NestJS as a general framework, with MikroORM for data modelling / migrations and PostGraphile for GraphQL support.
I've created a script to load all product "sto" files into Postgres. Branch is issues/8620-a.
Some products contain \u0000 in their data, which is not compatible with Postgres. The SQL used to fix this was:
update revision set data = replace(data::text,'\u0000','')::json
where code in ('04810513','3770001905075','4779030380333','4840237001946','6909995101119','7501058623256','7702024040040','7798305866775','7801620005191','7895000467013','7898142862043','8015057004453','8412600017975','9300617296614','9557789820127')
and data::text like '%\\u0000%';
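As an alternative to patching rows after the fact, the load script could strip the character before inserting. Below is a minimal sketch of that idea in Python, not the code used in the POC: the helper name `strip_nul` and the sample record (including its product code) are hypothetical and for illustration only.

```python
import json


def strip_nul(value):
    """Recursively remove U+0000 from every string in a decoded JSON value."""
    if isinstance(value, str):
        return value.replace("\u0000", "")
    if isinstance(value, list):
        return [strip_nul(item) for item in value]
    if isinstance(value, dict):
        # Keys are strings too, so clean them as well as the values.
        return {strip_nul(key): strip_nul(item) for key, item in value.items()}
    # Numbers, booleans and None pass through unchanged.
    return value


# Hypothetical product record containing the escaped NUL character.
raw = '{"code": "0000000000017", "product_name": "Choc\\u0000olate", "tags": ["en:sn\\u0000acks"]}'

clean = strip_nul(json.loads(raw))
clean_json = json.dumps(clean)  # safe to send to Postgres as JSON text
```

Cleaning the decoded value rather than the raw text avoids accidentally touching a string that merely contains the six characters `\u0000` as ordinary text.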
An example database has been uploaded here: https://static.openfoodfacts.org/data/pg/products.dmp
It can be restored using pg_restore.
That's a very interesting proposal.
Although I still haven't understood how OFF is organized, I definitely have the feeling that a regular relational model could bring many benefits like:
From what I heard during the March 2024 hackathon, there are some recursive relations within the data, but that's not a hindrance: most database management systems support recursive Common Table Expressions, which are the SQL way of expressing queries on recursive data.
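To make the recursive CTE point concrete, here is a small sketch. It runs against an in-memory SQLite database purely so it works without a server; the `WITH RECURSIVE` query itself uses the same syntax in Postgres. The `category` table, its single `parent_id` column, and the sample rows are assumptions made for illustration, not the actual OFF schema or taxonomy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical parent/child table; each row points at its parent (or NULL for a root).
conn.executescript("""
    CREATE TABLE category (
        id TEXT PRIMARY KEY,
        parent_id TEXT REFERENCES category(id)
    );
    INSERT INTO category VALUES
        ('en:beverages', NULL),
        ('en:hot-beverages', 'en:beverages'),
        ('en:teas', 'en:hot-beverages'),
        ('en:green-teas', 'en:teas'),
        ('en:snacks', NULL);
""")

# Recursive CTE: start from one category, then repeatedly join in its children.
descendants_sql = """
    WITH RECURSIVE subtree(id, depth) AS (
        SELECT id, 0 FROM category WHERE id = ?
        UNION ALL
        SELECT c.id, s.depth + 1
        FROM category AS c
        JOIN subtree AS s ON c.parent_id = s.id
    )
    SELECT id, depth FROM subtree ORDER BY depth, id
"""

rows = conn.execute(descendants_sql, ("en:beverages",)).fetchall()
for category_id, depth in rows:
    print("  " * depth + category_id)
```

The anchor `SELECT` picks the starting row and the part after `UNION ALL` is re-evaluated until it produces no new rows, so the whole subtree comes back from a single query.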
So thanks for your work, I'm eager to look at your postgres data.
Problem
Currently the OFF data is in a lot of different places (taxonomy files, MongoDB, STO files) which makes it difficult to perform queries across the data sets.
Aggregated queries against MongoDB are also very slow, and the author feels these would be considerably faster against a relational model.
Proposed solution
Move to a relational model
Part of
5527