SEED-platform / seed

Standard Energy Efficiency Data (SEED) Platform™ is a web-based application that helps organizations easily manage data on the energy performance of large groups of buildings.
Other
107 stars 55 forks source link

Geocoding Confidence in Master record are blank for merged records #2018

Closed RDmitchell closed 4 years ago

RDmitchell commented 4 years ago

Expected Behavior

I would expect that the program would show the Geocoding Confidence for the latest record in the Inventory Detail view when 2 or more records are merged

Actual Behavior

If a record is not merged with another record, the Geocoding Confidence field is displayed in the Master in the Detail View (and thus in the List View). This seems to be true in the Detail view of both the Tax Lot and Property tables Property Tab: If there is just one record in the Detail View, the program displays the Geocoding Confidence value from that record in the Master record image

However, if 2 records are merged, whether they have the same Geocoding Confidence field value or not, the Master record for that field is blank (this screen shot is actually from the production server, but the behavior is the same on dev1). image

And here is even a more anomalous case -- the Geocoding Confidence is displayed in the Property List view (and so I am assuming it would be in the Master record) image But when you look at the Master record, the Geoconfidence field is blank (!) image

Tax Lot Tab: Record that has merged records -- Geocoding Confidence field is blank in Master record image Record does not have merged records -- Geocoding confidence field in the Master record has the same value as the 1 file that has been imported image

Steps to Reproduce

Import 2 files where some records merge and some don't.

Instance Information

Instance: dev1 SHA: 6dad9f771

Instance: seedv2 (production) SHA: df21680b Org: LBNL Test 200

Files: 1 - example-data-ESPM.xlsx 1 - example-data-ESPM-UBID-NoAlerts.xlsx 1 - example-data-taxlots.xlsx

They can be found in the folder for this issue https://drive.google.com/drive/folders/1SD4Klum3wU-bgBNhasMNF3OFbuFjuCRg?usp=sharing

adrian-lara commented 4 years ago

Thinking about possible solutions for this issue, there's an interesting and complicated problem. I'm thinking there might not be an easy solution, but I'm giving this more thought and am considering partial solutions to at least have consistency in this case.

Details below to welcome ideas from others.

For a single record, the geocoding results (geocoding_confidence, latitude, longitude, and [under the hood] long_lat) correspond to the columns/values that were used for geocoding (geocoding columns). When merging two records into one resulting record, filling in empty values of one record with populated values of the other happens. Also, we have merge_protection that can be activated on a column.

This merge protection and/or missing values might prevent some or all of the geocoding results and corresponding geocoding columns from staying together. For example, consider merging a record with a confidence of "Low - check address (Ambiguous)" without latitude/longitude values to a record with both latitude and longitude. Not only would the confidence value be off for the new latitude/longitude values, but the values of the columns used for geocoding likely would not correspond with those latitude/longitude values either.

Add in merge protection, and consider other combinations of geocoded/non-geocoded records, and things could get even more complicated.

adrian-lara commented 4 years ago

I should've mentioned this in my first message but the simplest fix here would just be to allow for the geocoding columns to be separated from the geocoding results - as well as allow the components of each to be separated. (Chalk it up to being one of the "dangers" of merging?) This feels bad, but I thought it was worth mentioning.

The next best solution I can think of right now is the following: Thinking about the fact that users can change their geocoding columns at any time, it's possible for geocoding column values to "mismatch" geocoding results for any particular record anyway. Given that, I'm thinking it wouldn't be awful if geocoding column values were separated from geocoding results during merging. Thoughts?

If we only had to worry about the geocoding results sticking together, I think a reasonable solution would be to add custom logic in merging for geocoding_confidence, latitude, longitude, and long_lat values to always stick together.

All 4 would be packaged together and treated as one during merging - they would be treated like any single field. Merge protection on one of the 4 would trigger merge protection for all 4 (to the user, this would only look like 3, latitude, longitude, and geocoding_confidence). Empty values in the "base" record wouldn't just be populated if the other record has them populated, unless all 4 values in the "base" record were empty.

Writing this all out, I'm not sure how I feel about this solution. Would it be too confusing from a user perspective?

RDmitchell commented 4 years ago

I am thinking that maybe we shouldn't tackle this issue.

Seems to complicated to solve, plus would be confusing to the user.

I recommend putting this in the icebox, or closing it and documenting (Robin) it somewhere for the users how the program currently works.

adrian-lara commented 4 years ago

It is quite confusing to think about at first, but I think some action should absolutely take place since merging geocoded records just doesn't work right now. (The state of geocoding for a resulting record is completely inconsistent with what existed prior to the merge.)

I think the fix we're implementing keeps a good level of consistency and predictability around these interactions.

Specifically, the geocoding results are completely taken from only one of the merged records. The logic driving the decision of which record has their geocoding results used is consistent with how merging 2 records with overlapping column values already works.

RDmitchell commented 4 years ago

ok, sounds good. thx

RDmitchell commented 4 years ago

instance: dev1 SHA: c97042d2 Org: LBNL Test 200

This seems to be working now -- the master record is no longer blank for records with multiple files imported image