Sqlite database - Githubissues

USAFPride commented 5 years ago

cleanup due to "hurry up" coding.

alaudet commented 5 years ago

Thanks for the pull request

i have started looking at your code this morning. This is going to take me a while as, of course, I need to fully dig into the change and understand not only what the change is but also what this will impact going forward. Just a couple of quick notes to see if I am on the right track. I would appreciate if you could confirm my assumptions first before we start looking at the change itself.

you propose;

Giving the user an option to use a database or a csv file by adding a database = 1 or 0 in raspisump.conf
reading.py ... checking the conf file and logging the date, time, and reading to the raspi_sump.db file located in /home/pi/raspi-sump/data
reading.py passes the reading log.log_database

if configs["use_database"] == 1:
    log.log_database(water_depth, database_name)

database.py handles the actual creation of the database if it doesn't exist. If it exists it adds the reading.

Assumptions

The database continues to grow and will grow indefinitely.

Out of scope for this PR All other components of the program which pull data from the csv (such as charts, heartbeat alerts, determining if an alert should be sent or not).
This is ok by the way. One bite at a time. :-)

Observations Tackling the database portion is taking a big bite right at the beginning, it is central to the whole application. I am starting to see how much work will be required to polish it up. I will get more into my observations of your code as we move forward, but I see quite a bit of things that need to happen before merging into V2devel which should be working code very close to merging into V2master. So a sqlite branch makes sense to do that work.

Question Does your code work right now. i see at least one syntax error in log.py that will throw an exception. I have not actually tested anything yet.

USAFPride commented 5 years ago

I'll start backwards. I didn't have a chance to try the code as I was running out the door, but I've fixed the missing : in log.py and re-pushed :). It should work as written as most of it is code from other projects I have.

All your other assumptions are correct. The database will grow indefinitely, but it won't grow all that quick with 2 fields being written. And you are correct in that all other items that are relying on the CSV files will need to be rewritten to use SQL queries to get the information. I have 2 helper functions that can be added to database.py to handle getting data out. They'll need a little help to merge into this project, but should be pretty easy. As you pointed out, the database is the base of the change, so I wanted to start there before building the rest of the pyramid.

We could probably skip the log.py and go direct from reading.py to the database write, but the project seemed to run all file interactions through log.py, so I kept it the same.


# SQL Helper Functions
def getSQL_scalar ( sql):
        conn = sqlite3.connect( sqliteDB)
        c = conn.cursor()
       c.execute("PRAGMA journal_mode=WAL")
        c.execute ( sql)
        return c.fetchone()[0]

def getSQL_multiple ( sql):
        conn = sqlite3.connect( sqliteDB)
        c = conn.cursor()
        c.execute("PRAGMA journal_mode=WAL")
        c.execute ( sql)
        mult = c.fetchall()
        return mult```

alaudet commented 5 years ago

Code seems fairly straightforward but there are still errors in database.py.

This is one of the reasons that I try to do test driven development. Not sure if you have used any testing frameworks (unittest, nosetest, pytest) or anything. I have settled on nose which is a wrapper for unittest that simplifies the syntax quite a bit.

I find it essential when coding to be able to simply run the tests to ensure any change I make doesn't break anything. Its a little more code to write but saves so much time in the long run. I never merge anything until all nosetests have passed.

As is your code will not run, because of some undefined variables in database.py. A couple of variables have typos in them and this is something test code would have picked up.

The first thing I want to do when downloading your code is have a quick look and then run it. So going forward any contributions need to follow a couple of guidelines.

code has to work before issuing a pull request.
Commit messages need to be clear so that other people looking at it know where to look to see changes
nosetests are encouraged, I am going to have to end up writing them anyway
all code needs to be eligible to be released under the MIT license.

I do have some Contributing Guidelines for raspisump.

no.3 in the guidelines doesn't really apply because we are doing a new version so i expect to break compatibility with version 1. 4 and 6 are very important to me.

I appreciate the contribution and hope you are ok with the approach, I code in my spare time and I really am quite limited in the time that I have (usually an hour or two early morning). So I think it is fair that we communicate constantly on any collaboration and that the code is clean when we pass it back an forth.

alaudet commented 5 years ago

We can use this issue in the tracker to discuss all of the work around Sqlite3.

alaudet commented 5 years ago

Thanks for the breakdown, I didn't look at it any further because I saw some of the errors just reading it. I really need to make some time to work on this. The summer has been difficult to devote any time at all.

rtaft commented 5 years ago

I know the feeling, got to do the outside work you can while it's still summer. I'm sort of developing something similar just with multiple sensors and a float alarm for my septic effluent tank, though I'm contemplating raspi-sump again and separating the float alarm code. I was trying to get familiar with the code to see how it was coming.

USAFPride commented 5 years ago

Sorry for the delay in responding as I've been places that don't have internet. I never really intended this as a full pull request. I to am a part time coder and have spent significant time in the past coding something up for another project only to have it not used. This was more to get discussion rolling as it is a significant change to the structure and if the owner is not on board from the beginning about the changes, then there is no reason to go further.

Significant points have been brought up that make sense:

Multiple sensors - does this lead into multiple alarms? It's easy to add a sensor name field, but what other issues does this bring up. The alarm code would definitely need to be modified.
Reference: http://www.sqlitetutorial.net/sqlite-date/ for working with datetime. I (and many others) have existing weather databases that is working just fine with now() (and almost that line exactly). time.time() will return epoch time and is not human readable. Most people prefer now(). This comes from working python code and database and SQLITE3 handles datetime datatype just fine (albeit converted to text). Graphing is quite simple using the database format. Many projects use this format. eg: https://www.hackster.io/mjrobot/from-data-to-graph-a-web-journey-with-flask-and-sqlite-4dba35
I realize there is no primary key at this point, we may decide that there should be, so I left the code in as it doesn't cause any issues. Just future protection. Same with the drop table. While coded for standalone use with raspi-sump, I would modify it for my own personal use to have 1 database and multiple tables for different projects. This can be accomplished with #4 below. If doing it that way, you wouldn't want to lose another unrelated table. Again, future protection.
I chose the default database location and name based off of the existing structure. The existing csv files do not give you an option of naming convention nor file location. Doesn't matter to me where they end up or how it is determined.

rtaft commented 5 years ago

Yes, there would need to be refactoring to support multiple sensors. I originally bought a JSN-SR04T but then read that it was inaccurate, so I ordered a pack of HC-SR04 as well and installed both. My tank is buried so I figured having a second as a backup would be a good idea but the HC-SR04 started giving me bad readings as soon as I sealed the tank (probably a mounting issue). So at this point I'm only tracking one. I saw the ticket for converting to SQLite and was going to write it but you beat me to it.

For the date, I understand your points, there are tradeoffs either way. Epoch loses readability, but it has size and speed advantages. Storing it as a string will be 24 bytes, as a number is 8 bytes. If you are storing data once a minute for 10 years, that's 120MB vs 40MB of storage for the date. If you are searching the data, it would take 3x longer to search the text version (unless you index the column). The web interface would be where you want to read the data. Other DBs store it as a number but display it so you can read it, idk why SQLite does not. That brings up 2 more suggestions:

Adding a function to retrieve the data, by start and end date. This would be utilized by the web interface, so you can ask for graphs by date ranges.
It may be beneficial to index the date column if we are searching by it. This would resolve the any speed related issues looking up data when the database gets larger over time.

Technically there is a primary key, I think it uses ROWID if you don't specify one. As long as you never try to manually set the primary key (it auto increments in sqlite), that error will not appear.

USAFPride commented 5 years ago

Below is working code to retrieve rain value from the start of today local time (insDate is setup as datetime). SQLITE3 works with all normal datetime SQL functions. It's quite easy to modify for specific dates/time periods.

"select max(rainGauge) - min(rainGauge) from " + hxTablename + " where insDate > datetime('now', 'localtime', 'start of day') and insDate <= datetime('now', 'localtime') ;"

While theoretically it would be smaller and maybe faster, in this type of application, you will truly never see the speed difference. I just performed a query on my weather database to retrieve data from a specific date 2 years ago and the speed to perform that data and from a brand new database was almost negligible. If you are counting mSecs, its the processor that you are running it on that has more of an impact than dbase size.

alaudet / raspi-sump

Sqlite database #54