openfoodfacts / robotoff

🤖 Real-time and batch prediction service for Open Food Facts
https://openfoodfacts.github.io/robotoff/
GNU Affero General Public License v3.0

Copy prod data to staging (aka .net) #894

Open alexgarel opened 2 years ago

alexgarel commented 2 years ago

Problem

In production we have a lot of data, but very little in staging, because very few products are added there. For Product Opener, we copy data from production (in MongoDB).

Copying data from production to staging is not that easy, because the server_domain column has to be updated, which takes a lot of time.
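To illustrate why this is costly, here is a minimal sketch of the rewrite that a copy would require. It uses an in-memory SQLite database as a stand-in; the table name `insight`, the helper `rewrite_server_domain` and both domain values are assumptions made for illustration, not taken from the Robotoff schema.

```python
import sqlite3

# Hypothetical domain values; only the ".net" suffix for staging comes from the issue title.
PROD_DOMAIN = "api.openfoodfacts.org"
STAGING_DOMAIN = "api.openfoodfacts.net"


def rewrite_server_domain(conn, table, old, new):
    """Rewrite server_domain on every copied row and return the number of rows touched."""
    # The table name is interpolated directly, so it must come from trusted code.
    cur = conn.execute(
        f"UPDATE {table} SET server_domain = ? WHERE server_domain = ?",
        (new, old),
    )
    conn.commit()
    return cur.rowcount


# Stand-in for a table copied from production (hypothetical name and columns).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE insight (id INTEGER PRIMARY KEY, server_domain TEXT)")
conn.executemany(
    "INSERT INTO insight (server_domain) VALUES (?)", [(PROD_DOMAIN,)] * 3
)

updated = rewrite_server_domain(conn, "insight", PROD_DOMAIN, STAGING_DOMAIN)
```

Every copied row is rewritten, and the same statement would have to be repeated for each table that carries the column, which is where the time goes on a production-sized dataset.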

Proposed solution

I'm not sure what the best option is, but I would say:

raphael0202 commented 1 year ago

As discussed with @alexgarel, it would be more interesting to drop the server_domain field completely, as we have distinct environments for staging and production. This way we can import production data into staging without having to do any DB migration. As we're still considering adding support for OpenBeautyFacts/OpenProductFacts/... to Robotoff, we're keeping the server_type field.

What needs to be done:

raphael0202 commented 1 year ago

Fixed by #1083.

What has been done:

Full Docker command for the Elasticsearch update by query:

docker exec -it robotoff_elasticsearch_1 curl -X POST "http://elastic:$ELASTIC_PASSWORD@localhost:9200/logo/_update_by_query?conflicts=proceed&pretty" -H 'Content-Type: application/json' -d'{"query": {"bool": {"must_not": {"exists": {"field": "server_type"}}}}, "script": {"inline": "ctx._source.server_type = \"off\"", "lang": "painless"}}'
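For readability, the request body of that command can be sketched as a Python dictionary. This only builds and serializes the payload; it does not contact Elasticsearch. The helper name `build_backfill_body` is hypothetical, and the `inline` script key is kept as written in the command above (newer Elasticsearch versions name this key `source`).

```python
import json


def build_backfill_body(field="server_type", value="off"):
    """Build an update-by-query body that sets `field` on documents lacking it."""
    return {
        # Select only documents where the field does not exist yet.
        "query": {"bool": {"must_not": {"exists": {"field": field}}}},
        # Painless script applied to each matching document.
        "script": {
            "inline": f'ctx._source.{field} = "{value}"',
            "lang": "painless",
        },
    }


body = build_backfill_body()
payload = json.dumps(body)  # string to send as the POST body
```

Because the query matches only documents without server_type, re-running the request should leave already-updated documents untouched.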

raphael0202 commented 1 year ago

Well, I missed the fact that this issue was not really about multi-platform support in Robotoff, so I'm reopening it.