mathieubossaert / central2pg

Some PostgreSQL functions to retrieve data from an ODK Central server to PostgreSQL tables.
GNU General Public License v3.0
22 stars 9 forks source link

will the edited form in odk central get updated in pg #14

Open stephexcel2004 opened 2 years ago

stephexcel2004 commented 2 years ago

Would like to know whether the script also check the latest edited submission and update the old submission in pg

mathieubossaert commented 2 years ago

No as we do not edit submission into central, we do not manage it. But I think it could be quite easy adapting the "ON CONFLICT" clause here : https://github.com/mathieubossaert/central2pg/blob/a39e7d44255b5e67471188a5ca1d03ebba32233c/insert_into_from_refcursor.sql#L63 Will take a look at it. If you try something let me know if it works as expected.

mathieubossaert commented 2 years ago

To complete my first response : we use PostgreSQL's UPSERT (with unique indexes _xx_id attributes) to catch and manage conflicts on inserts when data already exits in the database. And for the moment we manage such cases with "DO NOTHING" action.

PhilKnight commented 2 years ago

I would also really like to have this feature. I took a look at it, it is more complex than my first attempt which did not work: VALUES ' || _sql_val ||' ON CONFLICT (data_id) DO UPDATE SET' || _sql_col || '= EXCLUDED.' || _sql_col || '; This is because _sql_val and _sql_col are comma separated lists not single values. I also wondered whether it was necessary to update here also: https://github.com/mathieubossaert/central2pg/blob/a39e7d44255b5e67471188a5ca1d03ebba32233c/get_submission_from_central.sql#L63

mathieubossaert commented 2 years ago

Glad to see users of my work from all around the world:-) I you

@PhilKnight you are right, we'll have to build the update statement : SET column_1 = excluded.column_1, column_2 = excluded.column_2

The second update you mentioned is also needed but easy to build in a static string (SET form_data = excluded.form_data)

One big issue to investigate is how to only update submissions that have been truly edited. Each time we invoke central2pg we get all submissions from central (I hope to be able to filter it one day : https://github.com/getodk/central-backend/issues/391 ) and we don't want to update all the data, only the edited one.

I want to make this possible in order to avoid consuming resources for data that I already have for months in my database.

I wish I had some time this summer to explore it but I couldn't. I started working on API calls using the expand=* parameter and submissionDate filtering but didn't get very far.

mathieubossaert commented 1 year ago

I will have another strategy (not trying to filter subtables nor explore expanded json) that consists in :

It is currently possible to fetch a subtable for a single Submission:/ v1/projects/{{projectId}}/forms/{{xmlFormId}}.svc/Submissions('{{instanceId}}')/{{repeatName}}

Will try to do it before the end of the year to be ready for next field season.