inveniosoftware-attic / jsonalchemy

JSONAlchemy
GNU General Public License v2.0
0 stars 10 forks source link

Using wrapt to avoid implementing "all the things" #29

Closed jalavik closed 9 years ago

jalavik commented 9 years ago

@jacquerie

jacquerie commented 9 years ago

I tried doing the same thing using @jirikuncar's approach: it turned out to be very easy (https://github.com/jirikuncar/jsonalchemy/commit/2480dcf5da4cdedd74217d0459119617059e9754). I think his approach of leaning heavily on jsonschema is preferable.

jalavik commented 9 years ago

@jacquerie Agreed on that observation. I favor that option. @jirikuncar @dset0x

jirikuncar commented 9 years ago

@jacquerie can somebody cherry-pick my dict branch with your updates and create new PR to pu? I'm quite busy now, but I'll try to merge it in the evening so we have good starting point for tomorrow grand finale. Thanks

jacquerie commented 9 years ago

@kaplun pointed out a performance pitfall of this approach, which I can confirm using this test: https://github.com/jacquerie/jsonalchemy/commit/171ff87ef33abb361928bbdc3bdc1ca5f2b49b64.

I get:

(jsonalchemy)~/Code/jsonalchemy$ time py.test tests/test_dictstyle.py 
================================================ test session starts ================================================
platform linux2 -- Python 2.7.8 -- py-1.4.26 -- pytest-2.7.0
rootdir: /home/jnotarst/Code/jsonalchemy, inifile: pytest.ini
plugins: pep8, cache, cov
collected 9 items 

tests/test_dictstyle.py .........
---------------------------------- coverage: platform linux2, python 2.7.8-final-0 ----------------------------------
Name                    Stmts   Miss  Cover   Missing
-----------------------------------------------------
jsonalchemy/__init__        2      0   100%   
jsonalchemy/dictstyle      82      0   100%   
jsonalchemy/factory        26     26     0%   24-65
jsonalchemy/types           0      0   100%   
jsonalchemy/utils           8      0   100%   
jsonalchemy/version         2      0   100%   
jsonalchemy/wrappers      130    130     0%   24-247
-----------------------------------------------------
TOTAL                     250    156    38%   

============================================ 9 passed in 142.13 seconds =============================================

real    2m22.377s
user    2m16.862s
sys 0m0.137s
jirikuncar commented 9 years ago

@jacquerie have you tried similar tests on @dset0x's and my version?

jacquerie commented 9 years ago

Argh, sorry for explaining myself badly: what I'm testing there is the "dictstyle" approach of @jirikuncar and @dset0x , using my own implementation of append.

kaplun commented 9 years ago

The issue at hand is that, regardless of the actual implementation, if you perform the validation automatically upon every modification this is quadratic to the number of fields (whatever the exact defintiion). If you are building a typical HEP record (with e.g. 3000 author) as you can see this can have a big performance impact.

jacquerie commented 9 years ago

In fact, the following patch fixes this performance issue: https://github.com/jacquerie/jsonalchemy/commit/d8383645646b137a5def3c2b5e85ed4ed52ab287. So, I think we will have to code carefully all the mutations and avoid, as @kaplun mentioned, having validation for every modification.

(jsonalchemy)~/Code/jsonalchemy$ time py.test tests/test_dictstyle.py 
================================================ test session starts ================================================
platform linux2 -- Python 2.7.8 -- py-1.4.26 -- pytest-2.7.0
rootdir: /home/jnotarst/Code/jsonalchemy, inifile: pytest.ini
plugins: pep8, cache, cov
collected 9 items 

tests/test_dictstyle.py .........
---------------------------------- coverage: platform linux2, python 2.7.8-final-0 ----------------------------------
Name                    Stmts   Miss  Cover   Missing
-----------------------------------------------------
jsonalchemy/__init__        2      0   100%   
jsonalchemy/dictstyle      81      1    99%   126
jsonalchemy/factory        26     26     0%   24-65
jsonalchemy/types           0      0   100%   
jsonalchemy/utils           8      0   100%   
jsonalchemy/version         2      0   100%   
jsonalchemy/wrappers      130    130     0%   24-247
-----------------------------------------------------
TOTAL                     249    157    37%   

============================================= 9 passed in 0.38 seconds ==============================================

real    0m0.597s
user    0m0.549s
sys 0m0.033s
dset0x commented 9 years ago

Just to make d838364 clear (after a conversation with @jacquerie) : This line

validate(value, self.schema['items'])

validates only the new, inserted value. This is what speeds the test up. However, as you can see this means that we have to do the checks that are broader than this value (say maxItems) manually.

This should be doable however, since jsonschema does not declare many such constraints.

kaplun commented 9 years ago

OTOH, an additional benefit to separate the validation step, is the possibility to have temporary inconsistent statuses... IMHO it should be up to the client code to decide when to enforce the validation. (e.g. imagine a situation when there are two depending fields...)

jirikuncar commented 9 years ago

Proposal

data = JSONDict({'a': []})
with data.transaction() as d:  # validation is turned off until __exit__
    for i in range(10000):
        d['a'].append(i)
kaplun commented 9 years ago

Nice proposal! Concretely: is d an instance of JSONDict where a validation flag is switched off? In this case, can the validation be switched off programmatically? E.g. to really let client code to disable it in general.

jirikuncar commented 9 years ago

a validation flag is switched off?

Don't let people to use any internals if you provide API. Then we can change the implementation while keeping compatible API.

jacquerie commented 9 years ago

Proposal

data = JSONDict({'a': []})
with data.transaction() as d:  # validation is turned off until __exit__
    for i in range(10000):
        d['a'].append(i)

Uh, that's a very cool API. :+1:

On the other hand, since I was already working on it, here's an implementation of JSONList without transactions that takes care about performance: https://github.com/jacquerie/jsonalchemy/commit/f63c27d5624d32e9d34bb53c43a14fffd779f46c

jirikuncar commented 9 years ago

As discussed we are going for dict style implementation.