PyBackup entrypoint - Githubissues

tdipisa commented 6 years ago

Develop a specific API to send Fulcrum's Records to the PyBackup module. An asynchronous mechanism needs to be included at this stage to ensure high availability of the endpoint service and to reduce (throttle) the downstream requests to the Fulcrum APIs that are needed to access and validate webhook payloads.

Needed functionalities:

- [x] Module API
~~- [ ] Asynchronous mechanism to manage the incoming records and grant high availability~~ (commented out by https://github.com/geosolutions-it/pyfulcrum/issues/3#issuecomment-431051535)

tdipisa commented 6 years ago

I deleted @cezio's comment by mistake. It read:

Asynchronous mechanism to manage the incoming records and grant high availability

I'd strongly suggest moving this functionality to the webhook web application, since that is the component responsible for flow control. Adding such a dependency to the library is very bad practice.

tdipisa commented 6 years ago

@cezio, as a requirement, this Python module must be independent of the web application. The web application specified in our proposal is only there to demonstrate how the Python module is used (it is not really part of the required work). In addition, the reason we need an asynchronous mechanism in the Python module is to properly handle multiple incoming requests when each request has to be validated before it is backed up: every incoming request requires HTTP requests to the Fulcrum APIs for validation, I/O operations against the Pg DB, and so on.

giohappy commented 6 years ago

The goal of the project is to implement the backup logic. I proposed including an asynchronous mechanism to increase the throughput of the app, but it wasn't meant to replace or implement a distributed mechanism such as Celery + brokers, or similar. Their initial reference was a simple, basic PHP script!

The main reason I thought about this was to release the webhook request as soon as possible, so as not to trigger protection mechanisms in the Fulcrum platform's webhook system.

On second thought, I agree that the core module probably shouldn't implement the async / queue concerns. I would leave this aside for the moment and possibly consider a mechanism at a higher level.

A key point is not to spend most of the time budget on this, and not to invest too much in the web app (no Celery, brokers, external servers, etc.).
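As an illustration of a lightweight mechanism at the web app level, here is a minimal sketch using only the Python standard library: the request handler enqueues the payload and returns immediately, while a single background thread does the slow work one payload at a time. All names (`start_worker`, `handle_webhook`, `stop_worker`, `process_payload`) are hypothetical and are not part of pyfulcrum; the status codes are an assumption about what the web app might return.

```python
import queue
import threading

# Hypothetical names throughout; none of this is pyfulcrum API.
_SENTINEL = object()


def start_worker(process_payload, maxsize=100):
    """Start one background thread that processes queued payloads in order."""
    jobs = queue.Queue(maxsize=maxsize)

    def run():
        while True:
            payload = jobs.get()
            try:
                if payload is _SENTINEL:
                    return
                # Slow part: e.g. validate against Fulcrum, then write to the db.
                process_payload(payload)
            except Exception as exc:
                # Keep the worker alive if one payload fails.
                print(f"failed to process payload: {exc}")
            finally:
                jobs.task_done()

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return jobs, thread


def handle_webhook(jobs, payload):
    """Called from the web app's request handler; returns an HTTP status code."""
    try:
        jobs.put_nowait(payload)
    except queue.Full:
        return 503  # queue is full: the payload was not accepted
    return 202  # accepted: processing happens in the background


def stop_worker(jobs, thread):
    """Ask the worker to exit after the payloads already queued."""
    jobs.put(_SENTINEL)
    thread.join()
```

Because there is a single worker, downstream calls to the Fulcrum APIs are serialized, which acts as a crude throttle. Queued payloads live only in memory, so they are lost if the process stops; that is the trade-off for avoiding a broker.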

cezio commented 6 years ago

This comment is related: https://github.com/geosolutions-it/pyfulcrum/issues/8#issuecomment-430595474

cezio commented 6 years ago

A few words on the implementation I'm working on:

- PyFulcrum will use SQLAlchemy, with migrations, for db storage.
- To access the API, the caller will have to instantiate ApiManager with a db session (or db connection) and an API key. The latter may become optional later for some uses.
- ApiManager will have a property for each resource type (.projects, .forms, etc.) with a uniform API (at the moment two methods: .get(id, cached) and .list(cached)). This is similar to the fulcrum Python module, but the key difference is that the gateway classes for resources will handle API and db-level operations internally and return db objects. The caller will have to specify whether it wants to update the db with live results (the cached flag). Internally, if cached is set to False, the resource gateway class will fetch data from the Fulcrum API, deserialize and transform it, save it to the db, and return the db object(s). If cached is set to True (the default), results are returned from the db only.

Usage example:

from sqlalchemy import create_engine
from pyfulcrum.lib.api import ApiManager
from fulcrum import Fulcrum

fulcrum = Fulcrum(key='super-secret-key')
DB_URL = 'postgresql://user:pwd@host/db'

# ApiManager takes the db handle and the Fulcrum client
api = ApiManager(session=create_engine(DB_URL), client=fulcrum)
api.forms.list()        # cached=True by default: results come from the db
api.forms.get(form_id)  # form_id: id of the form to fetch
...

At the moment I don't have access to the live API, so I'm using the data provided in the documentation and working on test cases with a stub API client.
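To make the cached / live distinction concrete, here is a minimal sketch of how a resource gateway could behave when backed by a stub client. It is not the pyfulcrum implementation: `StubFormsClient` and `FormsGateway` are hypothetical names, a plain dict stands in for the db session, and the `{"form": ...}` response shape is an assumption.

```python
class StubFormsClient:
    """Stand-in for the Fulcrum client: serves canned data, makes no HTTP calls."""

    def __init__(self, forms):
        self._forms = forms
        self.calls = 0  # lets tests check whether the "API" was hit

    def find(self, form_id):
        self.calls += 1
        return {"form": self._forms[form_id]}  # assumed response shape


class FormsGateway:
    """Hypothetical resource gateway hiding API and storage access."""

    def __init__(self, store, client):
        self._store = store    # a dict stands in for the db session
        self._client = client

    def get(self, form_id, cached=True):
        if not cached:
            # Live path: fetch from the API, transform, and save to storage.
            payload = self._client.find(form_id)["form"]
            self._store[form_id] = {"id": payload["id"], "name": payload["name"]}
        # In both cases the caller gets the stored object back.
        return self._store[form_id]
```

With this shape, `FormsGateway(store={}, client=stub).get("f1", cached=False)` goes through the stub and saves the result, while a later `get("f1")` is served from storage without touching the client. A cached lookup for an id that was never fetched raises `KeyError` in this sketch.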

cezio commented 6 years ago

Current tests and coverage:

====================================================================== test session starts ======================================================================
platform linux -- Python 3.6.6, pytest-3.9.2, py-1.7.0, pluggy-0.8.0 -- /mnt/work/cezio/geosolutions/repos/pyfulcrum/lib/venv/bin/python
cachedir: .pytest_cache
rootdir: /mnt/work/cezio/geosolutions/repos/pyfulcrum/lib, inifile: setup.cfg
plugins: cov-2.6.0
collected 7 items                                                                                                                                               

src/pyfulcrum/lib/tests/test_models.py::ModelsTestCase::test_forms PASSED
src/pyfulcrum/lib/tests/test_models.py::ModelsTestCase::test_media PASSED
src/pyfulcrum/lib/tests/test_models.py::ModelsTestCase::test_projects PASSED
src/pyfulcrum/lib/tests/test_models.py::ModelsTestCase::test_records PASSED
src/pyfulcrum/lib/tests/test_storage.py::StorageTestCase::test_storage_local PASSED
src/pyfulcrum/lib/tests/test_storage.py::StorageTestCase::test_storage_save PASSED
src/pyfulcrum/lib/tests/test_storage.py::StorageTestCase::test_storage_url PASSED

----------- coverage: platform linux, python 3.6.6-final-0 -----------
Name                                       Stmts   Miss  Cover
--------------------------------------------------------------
src/pyfulcrum/lib/__init__.py                  2      0   100%
src/pyfulcrum/lib/api.py                     176     35    80%
src/pyfulcrum/lib/cli.py                      87     87     0%
src/pyfulcrum/lib/formats.py                  92     76    17%
src/pyfulcrum/lib/migrations/__init__.py       0      0   100%
src/pyfulcrum/lib/migrations/env.py           22     22     0%
src/pyfulcrum/lib/models.py                  256     38    85%
src/pyfulcrum/lib/storage.py                  26      1    96%
src/pyfulcrum/lib/tests/__init__.py           79     14    82%
src/pyfulcrum/lib/tests/test_models.py        37      0   100%
src/pyfulcrum/lib/tests/test_storage.py       26      0   100%
--------------------------------------------------------------
TOTAL                                        803    273    66%

cezio commented 6 years ago

The module API is fairly simple; it's described in the README: https://github.com/cezio/pyfulcrum/tree/master/lib#pyfulcrum-api

geosolutions-it / pyfulcrum

PyBackup entrypoint #3