Core Prep to allow modular DataStores

jarvisms commented 5 years ago

Separating the file storage classes from pywws.storage and allowing config to define a backend module to use.

Addition of clear(), update() and __iter__() methods to file storage classes to make data behave more like lists/dicts and to internalise the process of clearing entire datasets.

Adaptations to reprocess.py and other utility modules to account for these changes.
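
As a rough illustration of the list/dict-like behaviour described above, here is a minimal sketch. `ToyStore` is a hypothetical in-memory stand-in, not the actual pywws file storage code. It assumes records are dicts carrying an `idx` timestamp and that `update()` takes an iterable of such records.

```python
from bisect import insort
from datetime import datetime


class ToyStore:
    """Hypothetical stand-in for a data store keyed by timestamp."""

    def __init__(self):
        self._keys = []   # timestamps, kept sorted
        self._data = {}   # timestamp -> record dict

    def __setitem__(self, idx, record):
        if idx not in self._data:
            insort(self._keys, idx)
        self._data[idx] = record

    def __getitem__(self, idx):
        return self._data[idx]

    def __len__(self):
        return len(self._keys)

    def __iter__(self):
        # Yield records in time order, like iterating a list.
        for key in self._keys:
            yield self._data[key]

    def update(self, records):
        # Assumption: each record is a dict with an 'idx' timestamp.
        for record in records:
            self[record['idx']] = record

    def clear(self):
        # Drop the whole dataset in one call, so callers such as a
        # reprocessing script need not know how the data is stored.
        self._keys.clear()
        self._data.clear()


store = ToyStore()
store.update([
    {'idx': datetime(2018, 5, 1, 12, 10), 'temp_out': 14.2},
    {'idx': datetime(2018, 5, 1, 12, 0), 'temp_out': 13.9},
])
ordered = [record['temp_out'] for record in store]
store.clear()
```

With methods like these, a caller can empty and refill a store without touching files or tables directly, which is the "internalising" mentioned above.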

jim-easterbrook commented 5 years ago

I thought the separation would be at a slightly lower level (CoreStore), as I assume you will still have the same raw, calib, hourly etc stores with the same keys.

Can we defer merging until you've got a complete and working alternative storage setup? I'm not planning any significant work on pywws for a while so there shouldn't be any problems merging later. If you'd like people to test your work so far you can ask them to clone your repos instead of mine.

You might also want to read PEP8 and make your contributions more compliant with it.

jarvisms commented 5 years ago

My sqlite3 backend is at the CoreStore level and I am adopting the same keys and methods so it should "look" the same to the outside world. I wanted to allow the user to choose a storage "system" (such as your original files, or sqlite, or external MySQL or whatever) and so it seemed to make sense to have CoreStore and its subclasses defined as a module in itself. In my current sqlite3 module, the subclasses (RawStore etc.) also have minor superficial changes which couldn't be built into CoreStore to be inherited (i.e. type conversions, other internal semantics etc.), hence I've pulled all of them into their own module along with CoreStore.
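
To make the idea concrete, here is a minimal sketch of picking a storage module by name from configuration. Everything named here is hypothetical: the `[storage]` section, the `module` option and the `toy_filestore` module are invented for illustration and are not the config keys or module names used in pywws. The stand-in module is registered in `sys.modules` only so the sketch runs on its own.

```python
import configparser
import importlib
import sys
import types


class RawStore:
    """Placeholder for a backend's RawStore class."""
    backend = 'file'


# Stand-in for a real importable backend module (hypothetical name).
toy_module = types.ModuleType('toy_filestore')
toy_module.RawStore = RawStore
sys.modules['toy_filestore'] = toy_module


def load_backend(config, default='toy_filestore'):
    # Read the module name from config (hypothetical section/option),
    # falling back to the default backend if it is not set.
    name = config.get('storage', 'module', fallback=default)
    return importlib.import_module(name)


config = configparser.ConfigParser()
config.read_string("[storage]\nmodule = toy_filestore\n")

backend = load_backend(config)
raw = backend.RawStore()   # caller never names the backend directly
```

Because every backend module exposes classes with the same names and methods, the calling code stays the same whichever module the config points at.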

I do have a working sqlite3 alternative, but want to get some better performance out of it with reprocess.py tests since sqlite3 fundamentally doesn't handle concurrency very well and reprocess.py will simultaneously read and write, so I'll see where that takes me. Also, my new module is currently a great example of how not to be PEP8 compliant. Unfortunately old habits and coding styles die hard! Coding is a hobby for me and this is the first time I've ever contributed or had any of my own code looked at by others so I appreciate the feedback.

jim-easterbrook commented 5 years ago

OK, I think I see where you're going. Duck typing FTW!

Don't worry too much about PEP8, you'll find plenty of examples where I don't follow it. At least you're not using TAB characters in Python source! It's your very long lines (docstrings, typically) that caught my eye. (Or didn't because they're mostly off screen.)

jarvisms commented 5 years ago

I really like how the existing data stores behave like dictionaries with additional slicing and so was very keen to maintain and build on that - plus it allows for a "drop in" replacement. It also translates over to SQL queries quite well but inevitably SQL itself has other niggles.
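
For illustration, a minimal sketch of how a slice on a dictionary-like store might be turned into an SQL range query. `ToySqlStore`, the `raw` table and its single `temp_out` column are hypothetical and far simpler than a real backend. It also assumes slices are half-open (start included, stop excluded) and that timestamps are stored as fixed-width ISO strings so that text ordering matches time ordering.

```python
import sqlite3
from datetime import datetime


def _key(idx):
    # Fixed-width ISO text, so string comparison follows time order.
    return idx.isoformat(timespec='seconds')


class ToySqlStore:
    """Hypothetical dict-like store backed by an sqlite3 table."""

    def __init__(self, conn):
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS raw "
            "(idx TEXT PRIMARY KEY, temp_out REAL)")

    def __setitem__(self, idx, record):
        self._conn.execute(
            "INSERT OR REPLACE INTO raw VALUES (?, ?)",
            (_key(idx), record['temp_out']))

    def __getitem__(self, key):
        if isinstance(key, slice):
            if key.step is not None:
                raise ValueError("slice step is not supported")
            clauses, params = [], []
            if key.start is not None:
                clauses.append("idx >= ?")
                params.append(_key(key.start))
            if key.stop is not None:
                clauses.append("idx < ?")
                params.append(_key(key.stop))
            where = (" WHERE " + " AND ".join(clauses)) if clauses else ""
            rows = self._conn.execute(
                "SELECT idx, temp_out FROM raw" + where + " ORDER BY idx",
                params)
            return [self._record(row) for row in rows]
        row = self._conn.execute(
            "SELECT idx, temp_out FROM raw WHERE idx = ?",
            (_key(key),)).fetchone()
        if row is None:
            raise KeyError(key)
        return self._record(row)

    @staticmethod
    def _record(row):
        return {'idx': datetime.fromisoformat(row[0]), 'temp_out': row[1]}


store = ToySqlStore(sqlite3.connect(':memory:'))
t1, t2, t3 = (datetime(2018, 5, 1, 12, m) for m in (0, 10, 20))
for t, temp in ((t1, 13.9), (t2, 14.2), (t3, 14.6)):
    store[t] = {'temp_out': temp}

window = store[t1:t3]   # becomes: WHERE idx >= ? AND idx < ?
```

The slice bounds map directly onto `WHERE` conditions, which is why the dictionary-with-slicing interface carries over to SQL fairly naturally.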

Admittedly I do normally use tabs, and only change when modifying existing code because Python3 moans about mixing and matching. The only "formal" teaching I ever had with any kind of programming was a little bit over a decade ago for scientific modelling in C and Fortran (Does anyone still use Fortran?) and for performance gains it sometimes benefited to write very long compact and unreadable one-liners which compile/execute faster than something which would be more readable. I'm pretty sure that's a PEP8 no-no.

jim-easterbrook / pywws

Core Prep to allow modular DataStores #74