fau-fablab / FabLabKasse

FabLabKasse, a Point-of-Sale Software for FabLabs and other public and trust-based workshops
https://fablabkasse.readthedocs.io
GNU General Public License v3.0

Improve kassenbuch speed #138

Closed patkan closed 8 years ago

patkan commented 8 years ago

After Julian was so thrilled with the kassenbuch performance, as was I, and everyone else really, @schrolli and I sat down and hacked together a few simple performance upgrades :-)

I am asking for a review, since this writes to the database :-P

codecov-io commented 8 years ago

Current coverage is 28.22%

Merging #138 into development will increase coverage by 1.09%

@@           development       #138   diff @@
=============================================
  Files               54         54          
  Lines             6709       6756    +47   
  Methods              0          0          
  Messages             0          0          
  Branches             0          0          
=============================================
+ Hits              1820       1907    +87   
+ Misses            4889       4849    -40   
  Partials             0          0          

Powered by Codecov. Last updated by 3c8a7cf...321cb9d

cod3monk commented 8 years ago

A cache of the sums would be another improvement, significantly speeding up summary_to_string. I did not add a running sum column to the buchung table, because it could become wrong. But if you find a way to make sure that nothing has changed up to a certain point, a cache could provide the sums up to that point; whenever something has changed, the cache needs to be updated. The buchung table has to remain the authoritative source of all account balances at all times.

The cache should be kept separate: at least a separate table, if not a separate database.
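One possible shape for such a separate cache table, sketched below with a hypothetical schema (the table and column names `kontosum_cache`, `last_id` etc. are illustrative, not FabLabKasse's actual ones): store the sum per konto together with the id of the last buchung it covers, so the balance only has to be summed over rows added since then. Note that this sketch is only correct for append-only data; edits to old rows would invalidate the cache, which is exactly the concern raised here.

```python
import sqlite3
from decimal import Decimal

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE buchung (id INTEGER PRIMARY KEY, konto TEXT, betrag TEXT);
-- separate cache table: sum per konto, plus the last buchung id it covers
CREATE TABLE kontosum_cache (konto TEXT PRIMARY KEY, saldo TEXT, last_id INTEGER);
""")

def cached_saldo(con, konto):
    """Return the balance of konto, reusing the cache and only summing new rows."""
    row = con.execute("SELECT saldo, last_id FROM kontosum_cache WHERE konto = ?",
                      (konto,)).fetchone()
    saldo, last_id = (Decimal(row[0]), row[1]) if row else (Decimal(0), 0)
    # sum only the bookings that arrived after the cached point
    for (betrag,) in con.execute(
            "SELECT betrag FROM buchung WHERE konto = ? AND id > ?",
            (konto, last_id)):
        saldo += Decimal(betrag)
    new_last = con.execute("SELECT COALESCE(MAX(id), ?) FROM buchung WHERE konto = ?",
                           (last_id, konto)).fetchone()[0]
    con.execute("INSERT OR REPLACE INTO kontosum_cache VALUES (?, ?, ?)",
                (konto, str(saldo), new_last))
    return saldo

con.executemany("INSERT INTO buchung (konto, betrag) VALUES (?, ?)",
                [("kasse", "10.00"), ("kasse", "-2.50")])
print(cached_saldo(con, "kasse"))  # prints 7.50
```

Amounts are kept as text and summed as Decimal in Python, sidestepping the float issue discussed further down.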

patkan commented 8 years ago

Wouldn't computing the sums for summary_to_string in the database be faster, instead of creating the buchung objects and then adding them up?

As a next step, we could use an SQL trigger to automatically compute the sum of all (or certain) entries on every update of the table. For this to be useful, however, we have to come up with a scheme for how to use it.

patkan commented 8 years ago

With 3dad868 I have added triggers that record the date of the last change to a table. We could then check these timestamps against the timestamp of a cache; this way we can keep the cache as long as no change has been made to the database.
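A minimal sketch of what such last-change triggers could look like in SQLite (table and trigger names are illustrative, not necessarily what 3dad868 used): every INSERT, UPDATE, and DELETE on buchung refreshes a timestamp row that a cache can compare against.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE buchung (id INTEGER PRIMARY KEY, konto TEXT, betrag TEXT);
CREATE TABLE last_change (tbl TEXT PRIMARY KEY, changed_at TEXT);

-- record the time of the last modification of buchung
CREATE TRIGGER buchung_ins AFTER INSERT ON buchung BEGIN
    INSERT OR REPLACE INTO last_change VALUES ('buchung', datetime('now'));
END;
CREATE TRIGGER buchung_upd AFTER UPDATE ON buchung BEGIN
    INSERT OR REPLACE INTO last_change VALUES ('buchung', datetime('now'));
END;
CREATE TRIGGER buchung_del AFTER DELETE ON buchung BEGIN
    INSERT OR REPLACE INTO last_change VALUES ('buchung', datetime('now'));
END;
""")

con.execute("INSERT INTO buchung (konto, betrag) VALUES ('kasse', '1.00')")
stamp = con.execute(
    "SELECT changed_at FROM last_change WHERE tbl = 'buchung'").fetchone()
print(stamp)  # e.g. ('2016-05-01 12:00:00',) -- compare against the cache's timestamp
```

A cache is then valid as long as its own timestamp is not older than `last_change.changed_at`.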

schrolli commented 8 years ago

Before you start caching things in separate tables, use SQL queries instead of crawling over objects. I can think of a query such as:

SELECT
    konto,
    SUM(CASE WHEN betrag > 0 THEN betrag ELSE 0 END) AS haben,
    SUM(CASE WHEN betrag < 0 THEN ABS(betrag) ELSE 0 END) AS soll,
    SUM(betrag) AS saldo
FROM buchung
WHERE datum < ?
GROUP BY konto
ORDER BY konto ASC

If the GROUP BY is too slow, an option would be to move the konto text field out into a separate table and reference konto.id in the buchung table; the GROUP BY would then be performed on an integer field. But that involves restructuring the database...

To get the newest buchung, a separate query is needed:

SELECT
    beschreibung,
    datum
FROM buchung
WHERE datum < ?
ORDER BY datum DESC
LIMIT 1

Since I have no populated database to play with, I could not verify or test these queries while fiddling with the code...
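For lack of a real database, the summary query can at least be exercised against a toy schema. A minimal runnable sketch (column names assumed from the query above; `betrag` stored as REAL only for this demo):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE buchung (konto TEXT, betrag REAL, datum TEXT)")
con.executemany("INSERT INTO buchung VALUES (?, ?, ?)", [
    ("kasse", 10.0, "2016-01-01"),
    ("kasse", -2.5, "2016-01-02"),
    ("bank",   5.0, "2016-01-03"),
])

# one row per konto: (konto, haben, soll, saldo)
rows = con.execute("""
    SELECT konto,
           SUM(CASE WHEN betrag > 0 THEN betrag ELSE 0 END) AS haben,
           SUM(CASE WHEN betrag < 0 THEN ABS(betrag) ELSE 0 END) AS soll,
           SUM(betrag) AS saldo
    FROM buchung
    WHERE datum < ?
    GROUP BY konto
    ORDER BY konto ASC
""", ("2016-02-01",)).fetchall()
print(rows)
```

With these sample rows, kasse should come out as haben 10.0, soll 2.5, saldo 7.5.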

patkan commented 8 years ago

I would strip 3dad868 from this PR and the merge it. OK?

mgmax commented 8 years ago

We cannot do SUM in the database because we use Decimal and sqlite uses float.
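A minimal illustration of the problem in plain Python (SQLite REAL columns hold the same IEEE-754 doubles, so SUM() in the database would accumulate the same drift):

```python
from decimal import Decimal

# ten bookings of 0.10 EUR as binary floats, like SQLite REAL would hold them:
float_total = sum([0.1] * 10)
print(float_total)        # 0.9999999999999999 -- not exactly 1.0

# the same amounts as Decimal, as FabLabKasse uses internally:
decimal_total = sum([Decimal("0.1")] * 10)
print(decimal_total)      # 1.0
```

Summing in Python over Decimal values stays exact, at the price of fetching the rows.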

patkan commented 8 years ago

I have removed the mentioned commit and would now say that this is ready to merge.

mgmax commented 8 years ago

now would be a good point to add a few tests that cover the date filtering and sum calculation.

schrolli commented 8 years ago

In the meantime I have gotten hold of a populated database to play with and have implemented the query mentioned above. It works fine in the sense that it displays the same results as the former object-crawling version, and it brought the execution time down from 1.2 s to under 0.25 s. I agree with the concerns about summing up many float values. But if that is a deal-breaker, I would move away from sqlite to a more sophisticated database engine like mysql/mariadb or postgres, which have decimal maths built in.

Testing: I could imagine providing a "test database" file with special values inserted and asserting against a precomputed result. The values would include some corner cases, such as a buchung with the exact datetime of an --until argument, where this buchung should not be included...

mgmax commented 8 years ago

I think the existing unit tests should be amended to generate test data by making bookings and then running the queries: https://github.com/fau-fablab/FabLabKasse/blob/development/FabLabKasse/test_kassenbuch.py
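A sketch of the kind of test being suggested, standalone against a simplified toy schema rather than the actual test_kassenbuch.py (all names here are illustrative): make a few bookings, then check that the date filter and the sum come out right, including the cutoff corner case mentioned above.

```python
import sqlite3
import unittest


class KassenbuchQueryTest(unittest.TestCase):
    """Sketch: make bookings, then check date filtering and sum calculation."""

    def setUp(self):
        self.con = sqlite3.connect(":memory:")
        self.con.execute("CREATE TABLE buchung (konto TEXT, betrag REAL, datum TEXT)")
        self.con.executemany("INSERT INTO buchung VALUES (?, ?, ?)", [
            ("kasse", 10.0, "2016-01-01 10:00:00"),
            ("kasse", -2.5, "2016-01-02 10:00:00"),
            ("kasse", 99.0, "2016-03-01 10:00:00"),  # after the cutoff, must be excluded
        ])

    def test_sum_respects_until(self):
        (saldo,) = self.con.execute(
            "SELECT SUM(betrag) FROM buchung WHERE datum < ?",
            ("2016-02-01 00:00:00",)).fetchone()
        self.assertEqual(saldo, 7.5)

    def test_booking_at_cutoff_excluded(self):
        # corner case: a buchung exactly at --until is not included
        (count,) = self.con.execute(
            "SELECT COUNT(*) FROM buchung WHERE datum < ?",
            ("2016-01-02 10:00:00",)).fetchone()
        self.assertEqual(count, 1)
```

Run with `python -m unittest` in the usual way.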

patkan commented 8 years ago

And for the generation of test data, the hypothesis package should be used. This makes it easier to write tests that do not make too many assumptions.
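For example, a property-based test could check an invariant of the summary (this is a sketch of the idea with hypothesis, not code from the PR; the property and names are made up for illustration): for any list of Decimal amounts, the saldo must equal haben minus soll.

```python
from decimal import Decimal

from hypothesis import given, strategies as st


@given(st.lists(st.decimals(min_value=Decimal("-100"),
                            max_value=Decimal("100"), places=2)))
def test_saldo_equals_haben_minus_soll(betraege):
    # haben: sum of credits, soll: sum of (absolute) debits
    haben = sum(b for b in betraege if b > 0)
    soll = sum(-b for b in betraege if b < 0)
    assert sum(betraege) == haben - soll


# calling the decorated function runs it against many generated inputs
test_saldo_equals_haben_minus_soll()
```

Because Decimal addition is exact, the property holds for every generated list; with floats, hypothesis would quickly find counterexamples.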

patkan commented 8 years ago

I have added some basic unit tests; they should be enough for now.

patkan commented 8 years ago

Deployed to Kasse.