Closed ian-r-rose closed 5 months ago
@ian-r-rose My understanding is that the null values are the result of bringing in stations that are active on a given date that do not have any raw data associated with them. The lane values come from the int_diagnostics__samples_per_station model which get them from the stg_clearinghousestation_raw model. Once we bring in the detector table, we can associate all detectors for station with their associated lane and eliminate these nulls. We would have used the stg_clearinghousestation_status model to bring in the detectors by lane but noticed that we were not getting all detectors for a station on any given date.
Hmm, so it it correct to say that all of these stations with null
values for the lane would have a 'Down/No Data' status?
My concern with having so many nulls here is that I'd like to be able to use int_diagnostics__detector_status
to filter data for all detectors that are "Good" for a given day. I suppose if all of the nulls represent detectors that are not good, we can work with it. But I think it would be nice to do some of the legwork to ensure that int_diagnostics__detector_status
has a grain of detector to more easily be able to use it. As it is, this table has data integrity problems, with about half the rows not having a valid lane number.
This will require work on the Active Stations model.
Hmm, so it it correct to say that all of these stations with
null
values for the lane would have a 'Down/No Data' status?My concern with having so many nulls here is that I'd like to be able to use
int_diagnostics__detector_status
to filter data for all detectors that are "Good" for a given day. I suppose if all of the nulls represent detectors that are not good, we can work with it. But I think it would be nice to do some of the legwork to ensure thatint_diagnostics__detector_status
has a grain of detector to more easily be able to use it. As it is, this table has data integrity problems, with about half the rows not having a valid lane number.
Yes, the stations and associated lanes would have a status of Down/No Data. I will add a where clause to reduce the number of lanes reporting status values where lanes do not exist (lane <= total lanes for a station). I will also look at incorporating the detector/lane data from the int_clearinghouse__station_status model to further reduce the number of null values. Once we have the configuration tables, we will also be able to incorporate controller data for additional statuses.
We should have controller data quite soon, so it may be worth waiting until then before implementing too many workarounds!
Based on review of the current detector_status model and the upcoming incorporation of the configuration data, the null values should be eliminated once the detector configuration data is joined with the detector_status.
This table has a significant fraction of
null
lane numbers, but it shouldn't!