SEED-platform / seed

Standard Energy Efficiency Data (SEED) Platform™ is a web-based application that helps organizations easily manage data on the energy performance of large groups of buildings.

Improvements to UBID handling during upload #4780

Closed perryr16 closed 2 months ago

perryr16 commented 2 months ago

Any background context you want to provide?

There were a number of issues with the current UBID implementation:

  1. UBID was not considered when linking across cycles
  2. If UBID was None, it was discarded from the matching criteria, which caused properties to be merged incorrectly. For reference, other matching criteria are not discarded when they are None.
  3. UBID history was not preserved if a property with 2 UBIDs separated by a ';' was imported on top of existing properties.
  4. UBID threshold lower limit was 0. This would allow matches to any UBID, defeating the purpose of it as a matching criterion.
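
To make issue 2 concrete, the sketch below contrasts the two behaviors on a pair of records that share a pm_property_id. It is a simplified illustration, not SEED's matching code: the function names are hypothetical, records are plain dicts, and UBIDs are compared by string equality rather than by Jaccard index.

```python
def matches_discarding_none(incoming, existing, fields):
    """Old behavior as described above: a None UBID is dropped from the criteria."""
    used = [f for f in fields if not (f == "ubid" and incoming.get(f) is None)]
    return all(incoming.get(f) == existing.get(f) for f in used)


def matches_keeping_none(incoming, existing, fields):
    """New behavior as described: a None UBID on either side means 'not a match'."""
    for f in fields:
        if f == "ubid" and (incoming.get(f) is None or existing.get(f) is None):
            return False
        if incoming.get(f) != existing.get(f):
            return False
    return True


fields = ["pm_property_id", "ubid"]
existing = {"pm_property_id": "1001", "ubid": "85FPPRRG+5V-0-0-0-0"}
incoming = {"pm_property_id": "1001", "ubid": None}

old_result = matches_discarding_none(incoming, existing, fields)  # merges on ID alone
new_result = matches_keeping_none(incoming, existing, fields)     # refuses to merge
```

With the UBID dropped, the records match on pm_property_id alone, which is the incorrect merge described in the list.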

What's this PR do?

  1. Adds UBID Jaccard checking to the linking step.
  2. If the incoming or existing UBID is None, it is no longer discarded from the matching criteria; the comparison is returned as "not a match".
  3. Preserves UBID history if imported file has multiple UBIDs per row.
  4. Limits the UBID threshold to the range 0.0001 to 0.1
  5. Removes UBID as a default matching field
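
As a rough illustration of items 1, 2, and 4, the sketch below computes a Jaccard index for two axis-aligned bounding boxes and applies a threshold. It rests on several assumptions: the boxes stand in for footprints that would be decoded from UBIDs (decoding is not shown), the `Box` layout and function names are hypothetical, and the `>=` comparison is a guess at how the threshold is applied rather than SEED's confirmed behavior.

```python
from typing import Optional, Tuple

# Hypothetical layout: (west, south, east, north)
Box = Tuple[float, float, float, float]


def area(box: Box) -> float:
    """Area of an axis-aligned box; degenerate boxes have area 0."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def jaccard(a: Box, b: Box) -> float:
    """Intersection area divided by union area."""
    west, south = max(a[0], b[0]), max(a[1], b[1])
    east, north = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, east - west) * max(0.0, north - south)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def ubid_match(a: Optional[Box], b: Optional[Box], threshold: float) -> bool:
    """A missing UBID on either side is 'not a match' rather than being skipped."""
    if a is None or b is None:
        return False
    return jaccard(a, b) >= threshold


# Two boxes that do not overlap at all.
disjoint_a, disjoint_b = (0, 0, 1, 1), (5, 5, 6, 6)
matches_at_zero = ubid_match(disjoint_a, disjoint_b, 0.0)      # threshold 0 accepts anything
matches_at_min = ubid_match(disjoint_a, disjoint_b, 0.0001)    # non-zero lower limit rejects
```

The last two lines show why a lower limit of 0 defeats the criterion: a Jaccard index of 0 still satisfies a threshold of 0.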

How should this be manually tested?

The following sample files may make testing simpler. The table matches the contents of the file with the suffix '_0'. The other files are identical except for their UBIDs: in file '_1' the UBIDs end with '1', in file '_none' the UBIDs are blank, and in file '_2ubids' each property has 2 UBIDs separated by ';' (85FPPRRG+5V-0-0-0-0;85FPPRRG+5V-0-0-0-5).

Files: 2props_ubids_0.xlsx, 2props_ubids_1.xlsx, 2props_ubids_none.xlsx, 2props_2ubids.xlsx


2props_ubids_0.xlsx

| PM Property ID | Property Name | UBID | Street Address | Site EUI |
|---|---|---|---|---|
| 1001 | NREL CAFE | 85FPPRRG+5V-0-0-0-0 | 15459-15593 Denver W Pkwy... | 1 |
| 1002 | NREL FTLB | 85FPPRRG+F2-0-0-0-0 | 15201-15697 Denver W Pkwy... | 2 |
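
In the '_2ubids' file, a single cell holds two UBIDs joined by ';'. A minimal sketch of splitting such a cell into individual UBIDs is shown below; `split_ubids` is a hypothetical helper for illustration, not a function from the SEED codebase.

```python
def split_ubids(cell):
    """Split a ';'-separated UBID cell into a list, ignoring blanks and None."""
    if cell is None:
        return []
    return [part.strip() for part in str(cell).split(";") if part.strip()]


ubids = split_ubids("85FPPRRG+5V-0-0-0-0;85FPPRRG+5V-0-0-0-5")
```

Keeping every entry from the cell, instead of only the first or last, is what allows the full UBID history to be preserved on import.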

For each of these testing procedures, set a strict UBID threshold (1.0) and make sure the matching criteria include pm_property_id and ubid. Removing inventory between tests is recommended.

Procedure 1: Testing UBID linking issues

Procedure 2: Testing when UBID is None

Procedure 3: Testing with multiple UBIDs

What are the relevant tickets?

#4764, #4774, #4775, #4786

Screenshots (if appropriate)