emoncms / MyHomeEnergyPlanner

My Home Energy Planner - Open Source home energy assessment software based on emoncms framework + openbem
GNU Affero General Public License v3.0

Move assessment json data object in database to tables #420

Closed cagabi closed 4 years ago

cagabi commented 5 years ago

Slack conversation 21-06-19

anna [3 days ago] do you think having the whole assessment stored as a single json blob still makes sense? I guess I've wondered if this is one of the things that makes it harder to change the schema without data loss bugs. a) do you think that's true? b) would it make sense to do this as part of the port/rewrite? I can see arguments in all directions

anna [3 days ago] my thinking was just that Django has pretty good migration tools and maybe the data is easier to extract/share/alter/migrate using those rather than manipulating JSON trees

anna [3 days ago] but maybe that is also the kind of work that, if it would make sense, could be done after the port, incrementally

cagabi [3 days ago] There is a lot of data to save, how would you put it in a table?

anna [3 days ago] well, I guess you'd have more than one table

cagabi [3 days ago] The backend doesn't use it at all.

anna [3 days ago] right, I see

anna [3 days ago] but you could expose it to the frontend using an API quite straightforwardly. I just think you get a lot of guarantees on data integrity when you store things in a database instead of a JSON blob

anna [3 days ago] but if you don't think it makes sense, fair enough

cagabi [3 days ago] This is an example of an input object to the model. It can be much bigger, with many more properties: https://github.com/carboncoop/openBEM/blob/master/model-dataIn-examples.js


var dataIn_model_r9 = {
   altituda: 10,
   region: 7,
   floors: [
   ... (preview truncated, see the linked file)

cagabi  [3 days ago]
It may make sense, so I'm happy to discuss the option. It's ok

anna  [3 days ago]
for example, just looking at the Library, it holds elements - I guess normally in a database you would have a Library table and an Elements table and you'd link them with a relationship.  I don't know what an Element looks like but if it's a fixed shape then it might make sense to store its data in database columns

anna  [3 days ago]
is that an assessment?  yeah, that is pretty complex

cagabi  [3 days ago]
So, having had a look at that example, if you think it still makes sense to put it into tables we can think about how to do it

anna  [3 days ago]
that looks out of scope for changing in this project

anna  [3 days ago]
so my sense is that each of the sub-objects in that object could be their own table

anna  [3 days ago]
e.g. fabric, ventilation, heating_systems.  especially the things that can be lists

anna  [3 days ago]
I assume there is stuff in an assessment that is taken from the libraries?

cagabi  [3 days ago]
I think these are all very valid considerations but for the migration to django I would leave it without this as it is not essential for the job

anna  [3 days ago]
yeah I think so too!

cagabi  [3 days ago]
for the libraries, each element in the library has its own fields due to the different nature of each library. Then when the assessor uses an element, that element gets added to the assessment data. Is that what you are asking?

anna  [3 days ago]
yeah, basically

anna  [3 days ago]
OK. well I assume there are different types of elements? it's not that every element has a totally different format?

cagabi  [3 days ago]
there are different types of libraries, like walls or heating systems. All the walls in the "walls" library have the same fields and all the systems in the "heating systems" library too, but a wall element is different from a heating system. So it would be a case of having one table per type of library
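
A minimal sketch of the "one table per type of library" idea, using Python's built-in sqlite3 so it runs on its own. The table names and the `uvalue` and `efficiency` columns are assumptions for illustration only, not the real MHEP library fields.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked

# Hypothetical schema: one shared library table, one element table per library type.
conn.executescript("""
CREATE TABLE library (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    type TEXT NOT NULL CHECK (type IN ('walls', 'heating_systems'))
);
CREATE TABLE wall_element (
    id         INTEGER PRIMARY KEY,
    library_id INTEGER NOT NULL REFERENCES library(id),
    name       TEXT NOT NULL,
    uvalue     REAL NOT NULL
);
CREATE TABLE heating_system_element (
    id         INTEGER PRIMARY KEY,
    library_id INTEGER NOT NULL REFERENCES library(id),
    name       TEXT NOT NULL,
    efficiency REAL NOT NULL
);
""")

walls_id = conn.execute(
    "INSERT INTO library (name, type) VALUES (?, ?)", ("Standard walls", "walls")
).lastrowid
conn.execute(
    "INSERT INTO wall_element (library_id, name, uvalue) VALUES (?, ?, ?)",
    (walls_id, "Solid brick", 2.1),
)

# An element pointing at a library that does not exist is rejected by the database.
try:
    conn.execute(
        "INSERT INTO wall_element (library_id, name, uvalue) VALUES (?, ?, ?)",
        (999, "Orphan wall", 1.0),
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

wall_count = conn.execute("SELECT COUNT(*) FROM wall_element").fetchone()[0]
```

Each element table carries only the fields that make sense for its type, and the relationship to its library is checked by the database rather than by application code.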

anna  [3 days ago]
right

anna  [3 days ago]
well, I'm not gonna say upfront that there would be a huge difference in switching to using database tables. but I suspect in the long run it would be more maintainable

peter  [3 days ago]
aye, but it would be a much bigger job I suppose

anna  [3 days ago]
like, if you add a field to a table then Django's migrations will pick it up.  If you need to modify all the data in a table to fit a new format or make an update, you can do that in a migration pretty easily and it's all built in.  and all the data is guaranteed to be in one format, it would be much harder for a bug to trash your data
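
A rough illustration of the migration idea described above, sketched with Python's built-in sqlite3 rather than Django so it runs on its own. In Django the same two steps would typically be an `AddField` operation plus a `RunPython` data migration. The `wall_element` table, the `insulated` column and the U-value threshold are all made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wall_element ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, uvalue REAL NOT NULL)"
)
conn.executemany(
    "INSERT INTO wall_element (name, uvalue) VALUES (?, ?)",
    [("Solid brick", 2.1), ("Insulated cavity", 0.3)],
)


def migrate_add_insulated_flag(conn):
    """Hypothetical migration: add a column, then backfill every existing row."""
    with conn:  # commits on success, rolls back if either statement fails
        # Schema change: every row gets the new column with a default value.
        conn.execute(
            "ALTER TABLE wall_element ADD COLUMN insulated INTEGER NOT NULL DEFAULT 0"
        )
        # Data change: bring all existing rows into the new format in one step.
        conn.execute("UPDATE wall_element SET insulated = 1 WHERE uvalue < 1.0")


migrate_add_insulated_flag(conn)
rows = conn.execute("SELECT name, insulated FROM wall_element ORDER BY id").fetchall()
```

After the migration every row has the new field, so no record is left in the old shape.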

anna  [3 days ago]
but yeah, I agree it would be a much bigger job. worth thinking about in the future though

peter  [3 days ago]
yes, no schema is a nightmare

cagabi  [3 days ago]
Blurb coming: a possible limitation I can think of now for moving the input data object to tables is its rigidity and how that may affect compatibility with future versions of SAP.
We don't want to update the old assessments' data when we change the inputs of the model. If we did want to, I see your point and how the migrations would let us do it.
I was in fact working last week on a feature to be able to use different model versions (which may mean different data objects) in the same version of the tool. This is because they want to be able to load old assessments in the future, but the assessments have to remain as they were when they were generated.
There have been some discussions about what would happen if the inputs to the model changed in new versions, as this would require changing the user interface. I have kind of sorted that problem in my head.
My point is that the only way to keep data inputs for different versions of the model in the same database is to save them in the big json blob. Another option would be to actually archive a database of assessments and MHEP when there is a change in the model inputs
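
One way to picture keeping inputs for several model versions side by side in the blob is to store a version tag next to the data and pick the model from it. This is only a sketch under assumptions: the `model_version` key, the `MODELS` registry and both toy model functions are hypothetical and are not how MHEP or openBEM actually work.

```python
import json


def run_model_v1(data):
    # Toy stand-in for an older model: only knows about floors.
    return {"total_floor_area": sum(f["area"] for f in data["floors"])}


def run_model_v2(data):
    # Toy stand-in for a newer model with an extra input.
    result = run_model_v1(data)
    result["region"] = data["region"]
    return result


# Hypothetical registry mapping a stored version tag to the model that understands it.
MODELS = {"v1": run_model_v1, "v2": run_model_v2}


def run_assessment(stored_json):
    """Load a stored blob and run it with the model version it was created for."""
    record = json.loads(stored_json)
    return MODELS[record["model_version"]](record["data"])


old_blob = json.dumps(
    {"model_version": "v1", "data": {"floors": [{"area": 40}, {"area": 35}]}}
)
new_blob = json.dumps(
    {"model_version": "v2", "data": {"floors": [{"area": 50}], "region": 7}}
)

old_result = run_assessment(old_blob)
new_result = run_assessment(new_blob)
```

The old blob is never rewritten. It is simply handed to the model version it was generated with.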

cagabi  [3 days ago]
https://github.com/emoncms/MyHomeEnergyPlanner/issues/410
marianneURBED
#410 Archiving Facility - for good record keeping
Before we make more major changes to the model we should make sure we can keep archive/record versions of old assessments - including the model data that was used to create them.

This is so we can go back to assessments later when householders have queries and be able to adjust/check them.

We need to be absolutely clear on things like the grid carbon intensities assumed at the time the assessment was done etc (could be very confusing to unpick if you have a PDF of an assessment that shows different results when this changes/ updates for example).
Labels: For release, High priority, enhancement

cagabi  [3 days ago]
This gets messy :stuck_out_tongue:

anna  [3 days ago]
yeah, I can see that.  there are totally ways to handle it though... either you have separate tables + libraries + etc for SAP4 vs SAP5, or you allow your tables to model both of them with a flag and clone the data

anna  [3 days ago]
you can also do stuff like store every change to a data record as a new revision (a new row) and then when you reference them somewhere else, store the revision number you're referencing. that way you can allow people to edit libraries and update values without losing old reports. but it would be easy to regenerate reports using the new data from libraries or whatever
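
The revision idea above could look roughly like this, again using Python's built-in sqlite3. The table layout and the `uvalue` column are assumptions for illustration. Every edit to a library element becomes a new row, and an assessment stores the revision number it used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE element_revision (
    element_id INTEGER NOT NULL,
    revision   INTEGER NOT NULL,
    uvalue     REAL NOT NULL,
    PRIMARY KEY (element_id, revision)
);
CREATE TABLE assessment_element (
    assessment_id INTEGER NOT NULL,
    element_id    INTEGER NOT NULL,
    revision      INTEGER NOT NULL,
    FOREIGN KEY (element_id, revision)
        REFERENCES element_revision (element_id, revision)
);
""")


def save_revision(conn, element_id, uvalue):
    """Never update in place: store each change as the next revision row."""
    latest = conn.execute(
        "SELECT COALESCE(MAX(revision), 0) FROM element_revision WHERE element_id = ?",
        (element_id,),
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO element_revision (element_id, revision, uvalue) VALUES (?, ?, ?)",
        (element_id, latest + 1, uvalue),
    )
    return latest + 1


rev1 = save_revision(conn, 1, 2.1)
# The assessment pins the revision that existed when it was produced.
conn.execute("INSERT INTO assessment_element VALUES (?, ?, ?)", (100, 1, rev1))
# Someone later edits the library element.
rev2 = save_revision(conn, 1, 1.8)

pinned = conn.execute(
    "SELECT r.uvalue FROM assessment_element a "
    "JOIN element_revision r "
    "ON r.element_id = a.element_id AND r.revision = a.revision "
    "WHERE a.assessment_id = ?",
    (100,),
).fetchone()[0]
latest = conn.execute(
    "SELECT uvalue FROM element_revision WHERE element_id = ? "
    "ORDER BY revision DESC LIMIT 1",
    (1,),
).fetchone()[0]
```

The old report still reads the value it was generated with (`pinned`), while regenerating it against current library data would read `latest`.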

anna  [3 days ago]
a database is just a typed and relational data store. anything you can model in JSON you can model in databases too! and vice versa

anna  [3 days ago]
but databases give you a bunch more stuff for free than JSON does

ghost commented 4 years ago

No longer relevant.