cagov / caldata-mdsa-caltrans-pems

CalData's MDSA project with Caltrans on Performance Measurement System (PeMS) data
https://cagov.github.io/caldata-mdsa-caltrans-pems/
MIT License

Imputation: compute from global coefficients #170

Open ian-r-rose opened 2 months ago

ian-r-rose commented 2 months ago

Perform imputation on the 5-minute unimputed data using linear regression coefficients computed from all detectors in a district, as documented in the user manual.

There is also an opportunity to refine what is considered "global". What's described in the user manual is using all detectors in a district, but we could also do, e.g., all detectors in a district on the same freeway, or all detectors within a certain distance of the one in question.


The algorithm is described in more detail here.

Additional information: linear regression from neighbors based on global coefficients. This imputation method is similar to computing from local coefficients. The one caveat is that some loops never report data, so local regression coefficients cannot be computed for them. In that situation, we compute global regression coefficients instead. These coefficients represent the general relationship seen throughout the route.
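To make the idea concrete, here is a minimal sketch of a global-coefficient fit, assuming a simple one-neighbor model of the form `target = intercept + slope * neighbor`. It is not the PeMS implementation or this project's code: the function names, the `Coefficients` type, and the sample numbers are all hypothetical. The point is only that the observations are pooled across many detector pairs, so a detector that never reports can still be estimated from its neighbor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coefficients:
    """Hypothetical container for one fitted line."""
    intercept: float
    slope: float


def fit_global_coefficients(pairs):
    """Ordinary least squares fit of target = intercept + slope * neighbor.

    `pairs` pools (neighbor_value, target_value) observations from every
    detector pair in the chosen group (for example, a whole district).
    """
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two observations to fit a line")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    if sxx == 0:
        raise ValueError("neighbor values are constant; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return Coefficients(intercept=mean_y - slope * mean_x, slope=slope)


def impute(neighbor_value, coefficients):
    """Estimate a missing value from a neighbor's reading; None stays None."""
    if neighbor_value is None:
        return None
    return coefficients.intercept + coefficients.slope * neighbor_value


# Made-up pooled 5-minute flow observations: (neighbor, target).
pooled = [(10.0, 21.0), (20.0, 41.0), (30.0, 61.0), (40.0, 81.0)]
coeffs = fit_global_coefficients(pooled)
estimate = impute(25.0, coeffs)
```

Changing what counts as "global" (district, freeway, or a distance buffer) only changes which pairs go into `pooled`; the fit itself stays the same.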

mmmiah commented 1 month ago

We are working on this and have opened a new issue, #199.

britt-allen commented 1 month ago

Why is this issue closed? You linked to a pull request, not a new issue, @mmmiah. I don't understand how pull request #199 relates to this issue. This issue can't be closed unless a PR closes it out or we decide as a team that this is unplanned work.

mmmiah commented 1 month ago

@britt-allen, alright. PR #199 aims to fulfill the goal described in this issue, but it is still at an initial stage. I misunderstood your comments and thought you wanted me to close #170. I have edited the final goal of #199 to read: 'This code identifies all the upstream and downstream stations within a five-mile buffer that are in the same district, route, direction, and highway. The goal of this model is to identify the stations that can be used to develop a linear regression model and determine the global coefficients to impute the missing performance metrics where local regression models fail.'
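For readers following along, the station selection described above could look roughly like the sketch below. It is not the code in #199. It assumes each station carries a district, freeway, direction, and an absolute postmile, and that the difference in absolute postmile is an acceptable stand-in for distance along the route. The `Station` fields and the `nearby_stations` helper are hypothetical names. Whether a positive offset means upstream or downstream depends on the direction of travel, so the sketch returns the signed offset and leaves that labeling to the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Station:
    """Hypothetical station metadata record."""
    station_id: int
    district: int
    freeway: int
    direction: str
    absolute_postmile: float


def nearby_stations(target, stations, buffer_miles=5.0):
    """Return (station, signed_offset_miles) pairs, nearest first.

    A station qualifies if it shares the target's district, freeway, and
    direction and its postmile lies within `buffer_miles` of the target.
    """
    corridor = (target.district, target.freeway, target.direction)
    matches = []
    for s in stations:
        if s.station_id == target.station_id:
            continue  # a station is not its own neighbor
        offset = s.absolute_postmile - target.absolute_postmile
        if (s.district, s.freeway, s.direction) == corridor and abs(offset) <= buffer_miles:
            matches.append((s, offset))
    return sorted(matches, key=lambda match: abs(match[1]))


# Made-up stations for illustration only.
target = Station(1, 7, 405, "N", 10.0)
stations = [
    target,
    Station(2, 7, 405, "N", 12.5),  # same corridor, 2.5 miles away
    Station(3, 7, 405, "S", 11.0),  # opposite direction, excluded
    Station(4, 7, 405, "N", 16.0),  # 6 miles away, outside the buffer
    Station(5, 7, 405, "N", 6.0),   # same corridor, 4 miles away
    Station(6, 8, 405, "N", 10.5),  # different district, excluded
]
neighbors = nearby_stations(target, stations)
neighbor_ids = [s.station_id for s, _ in neighbors]
```

The stations returned this way would supply the observation pairs for the regression fit.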

junlee-analytica commented 2 weeks ago

Global coefficients changed to regional coefficients.

britt-allen commented 6 days ago

Is this issue still being worked on?

ian-r-rose commented 6 days ago

Yes, in #250