goat-community / goat

This is the home of Geo Open Accessibility Tool (GOAT)
GNU General Public License v3.0
89 stars 47 forks source link

Write function to recreate all database functions and triggers #1368

Closed EPajares closed 1 year ago

EPajares commented 1 year ago

Goal of this issue

We were considering using alembic for managing the functions and triggers. Though this led to a couple of problems when maintaining and developing the functions. It would be better to move back to the original approach in which we dropped all functions and recreated them from scratch. This might be even easier now as we are having all functions in the schema public.

Resources

The functions are here: https://github.com/goat-community/goat/tree/main/app/api/src/db/sql I think the following procedure could work:

  1. Drop all functions in schema basic.
  2. We would have to read through the folder and create the functions one by one using the db-connections. There is one thing to be considered. Some functions are relying on each other. Meaning that cannot be created if the dependency functions do not exist yet.

I would also say that the new function recreating the PostgreSQL functions should go here: https://github.com/goat-community/goat/blob/main/app/api/src/db/sql/init_sql.py

This is the function that was previously called by alembic.

Deliverables

A python function that recreates all postgresql-functions and triggers.

metemaddar commented 1 year ago

@EPajares Should it patch to dev branch?

EPajares commented 1 year ago

Yes please to dev

metemaddar commented 1 year ago

We have some function files with same name as db functions, Also have some files witch their function is not in db and some new functions needed to add. We need to classify the new functions. And decide about removed functions.

helper/

new functions (not in files but in db):

We should classify these functions.

not in db but in files:

Handling dependencies

For handling dependencies, I think we can add priorities to the file names. For example 0010 to 0500 for the file names. So if we had new functions with higher priority we can add with changing the lower value part (left side) and if we had new function with lower priority we can add to the end by adding to the right side (for example 0211).

Then we need to change the rewrite function to add according to priority. At the moment we have 50 functions.

Wipe db functions

We need to add a function to remove all db functions

metemaddar commented 1 year ago

We also can add the functions in a python list to set the priority. Then the function can find the file and apply to database.

metemaddar commented 1 year ago

A more simple way is to use try/except to add functions. Loop over them till they completely get empty. :x: But this can add much errors to database log. Also it can run into infinite loop when a dependency is not in the files.

EPajares commented 1 year ago

@metemaddar This is an excellent overview. Some of the missing functions are triggers. You can find them here: https://github.com/goat-community/goat/tree/dev/app/api/src/db/sql/triggers

I forgot to mention that also these we have to recreate. Concerning the other ones that are in DB but not in the codebase, I will review each of them. Though most of them we don't need anymore. They are mostly some testing functions that we forgot to delete.

metemaddar commented 1 year ago

@EPajares Let me recreate the list. I will edit the previous comment.

handling priorites

We also can write a function to read the file first and check if one of other unapplied functions is in the file. This can let us get rid of handling priorities with list or numbers.

metemaddar commented 1 year ago

@EPajares By including the triggers/ directory it had significant change, so added the report bellow:

helper/

scenario/

triggers/

new functions (needs classification):

not in db but in files:

metemaddar commented 1 year ago

We also can write a function to read the file first and check if one of other unapplied functions is in the file. This can let us get rid of handling priorities with list or numbers.

With this it ran into a circular dependency because of name similarity:

heatmap_accessibility_population : {'heatmap_local_accessibility', 'heatmap_population'}
heatmap_local_accessibility : {'prepare_heatmap_local_accessibility'}
prepare_heatmap_local_accessibility : {'heatmap_local_accessibility', 'modified_pois', 'poi_categories', 'select_customization', 'prepare_heatmap_local_accessibility', 'poi_aoi_visualization'}
metemaddar commented 1 year ago

I've updated the dependency checker. And the dependencies are more clear now.

prepare_heatmap_population : {'heatmap_population'}
heatmap_population : {'prepare_heatmap_population'}
heatmap_local_accessibility : {'prepare_heatmap_local_accessibility'}
prepare_heatmap_local_accessibility : {'heatmap_local_accessibility'}

:heavy_check_mark: Fixed by prepending a dot to the checking function name, as every function is coming from a specific schema. eg: basic.heatmap_local_accessibility() which has a dot between basic and function name.

EPajares commented 1 year ago

Thanks this is great!

metemaddar commented 1 year ago

After adding report to init_sql, some new differences was found:

# in db but not in files:
-  count_public_transport_services_station

## in files but not in db:
-  count_public_transport_station_frequency.sql

:heavy_check_mark: fixed by rename the file name.