Closed atdservicebot closed 3 years ago
Here's the GRIDSMART open dataset.
It's not possible to use SQL queries against the OData endpoint, but we should be able to use the native Socrata API $query to write a query which aggregates this data into 1-hour bins.
@SurbhiBakshi did some preliminary research to understand PowerBI's capability for consuming Socrata's native JSON endpoint vs the OData endpoint (what the MMC is currently using). We should be able to walk Lance through this transition.
Two tasks at hand:
JSON endpoint.
@SurbhiBakshi do you think you can continue your research this week? Would be good to know before we move ahead with writing the SQL query.
~@mateoclarke @amenity flagging this for the next sprint planning~
Never mind, going to ask UTCTR to take a crack at it.
I asked Ken @ CTR if he could work on this. I believe we're going to have a meeting to discuss. Lance is in the loop.
@johnclary - yes, I will continue looking into this issue. I think documenting what I find will help with future reporting needs.
@SurbhiBakshi here's a query you can use which should be fairly close to what we settle on:
https://data.austintexas.gov/resource/sh59-i6y9.json?$query=select atd_device_id, intersection_name, direction, volume, date_trunc_ymd(read_date) as date, date_extract_hh(read_date) as hour where atd_device_id = '6382' and date_extract_y(read_date) > 2019 |> select atd_device_id, intersection_name, direction, date, sum(volume) as volume, hour group by atd_device_id, intersection_name, direction, date, hour limit 999999999999
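For anyone who wants to call this outside a browser: the query contains spaces, quotes, and the `|>` pipe, so it should be percent-encoded before it goes on the wire. Below is a minimal Python sketch of that, standard library only. The endpoint and SoQL text are copied from the query above; the helper names `build_url` and `fetch_rows` are mine, for illustration only.

```python
import json
from urllib.parse import quote, urlsplit, parse_qs
from urllib.request import urlopen

# Endpoint taken from the query above.
BASE_URL = "https://data.austintexas.gov/resource/sh59-i6y9.json"

# SoQL copied from the query above: one device, 2020 onward, hourly bins.
SOQL = (
    "select atd_device_id, intersection_name, direction, volume, "
    "date_trunc_ymd(read_date) as date, date_extract_hh(read_date) as hour "
    "where atd_device_id = '6382' and date_extract_y(read_date) > 2019 "
    "|> select atd_device_id, intersection_name, direction, date, "
    "sum(volume) as volume, hour "
    "group by atd_device_id, intersection_name, direction, date, hour "
    "limit 999999999999"
)


def build_url(soql: str, base_url: str = BASE_URL) -> str:
    """Return the endpoint URL with the SoQL statement percent-encoded in $query."""
    return f"{base_url}?$query={quote(soql, safe='')}"


def fetch_rows(url: str) -> list:
    """Fetch the URL and decode the JSON array of rows (needs network access)."""
    with urlopen(url) as response:
        return json.load(response)


url = build_url(SOQL)
print(url)
```

`fetch_rows(url)` is left uncalled here because it needs network access and the result can be large.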
Email to AMD and CTR:
I was able to dig up an old wavetronix query and re-purpose it to bin the GRIDSMART data.
It needs a bit more work, but we should be able to knock this out. Lance/Allyson, I've attached a CSV of the output.
This is only for one device, but you get the idea. Would you review the CSV and let us know what changes or additional columns you'll need? Once we have the query pinned down we can walk you through how to implement it in PowerBI.
The query, for reference:
https://data.austintexas.gov/resource/sh59-i6y9.json?$query=select atd_device_id, intersection_name, direction, volume, date_trunc_ymd(read_date) as date, date_extract_hh(read_date) as hour where atd_device_id = '6382' and date_extract_y(read_date) > 2019 |> select atd_device_id, intersection_name, direction, date, sum(volume) as volume, hour group by atd_device_id, intersection_name, direction, date, hour limit 999999999999
@johnclary - the query worked and did not need any tweaking in Power BI.
Here is the Power BI report. I was able to use the query without any adjustments, and set it to refresh a few times over the day. The data is getting refreshed.
Thanks @SurbhiBakshi. Email sent to MMC below.
I've updated the query to add movement and day of week, and to include only the device IDs of interest. This works out to ~1m rows of data.
Surbhi has done some testing in PowerBI to figure out how to ingest the data, because you must connect to it with a generic web connection instead of the OData connection. Surbhi, can you walk Lance and Allyson through this process? We can set up a meeting if needed, but it sounded fairly straightforward.
Hopefully this works smoothly. If it doesn't, we can explore Ken's suggestion to create a Socrata dataset "view" of the aggregated values.
Here's the updated query:
https://data.austintexas.gov/resource/sh59-i6y9.json?$query=select atd_device_id, intersection_name, direction, movement, volume, date_trunc_ymd(read_date) as date, date_extract_hh(read_date) as hour, date_extract_dow(read_date) as dow where atd_device_id in ('6170', '6177', '6211', '6351', '6382', '6736', '6881', '6882', '7014', '7015', '7038') and date_extract_y(read_date) > 2019 |> select atd_device_id, intersection_name, direction, movement, date,dow, sum(volume) as volume, hour group by atd_device_id, intersection_name, direction, movement, date, hour, dow limit 999999999999999
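If the list of devices changes, hand-editing the `in (...)` clause gets error-prone. Here is a rough Python sketch that generates the same SoQL from a list of device IDs. The column names, device IDs, year filter, and limit are copied from the query above; the function name `build_hourly_query` and the digits-only check on IDs are my own assumptions, not something the dataset requires.

```python
# Device IDs copied from the updated query above.
DEVICE_IDS = [
    "6170", "6177", "6211", "6351", "6382", "6736",
    "6881", "6882", "7014", "7015", "7038",
]


def build_hourly_query(device_ids, after_year=2019, limit=999999999999999):
    """Build the hourly-bin SoQL statement for the given device IDs."""
    for device_id in device_ids:
        # Assumption: IDs are numeric strings. Rejecting anything else keeps
        # stray quotes out of the generated statement.
        if not str(device_id).isdigit():
            raise ValueError(f"unexpected device id: {device_id!r}")
    id_list = ", ".join(f"'{device_id}'" for device_id in device_ids)
    return (
        "select atd_device_id, intersection_name, direction, movement, volume, "
        "date_trunc_ymd(read_date) as date, date_extract_hh(read_date) as hour, "
        "date_extract_dow(read_date) as dow "
        f"where atd_device_id in ({id_list}) "
        f"and date_extract_y(read_date) > {after_year} "
        "|> select atd_device_id, intersection_name, direction, movement, "
        "date, dow, sum(volume) as volume, hour "
        "group by atd_device_id, intersection_name, direction, movement, "
        "date, hour, dow "
        f"limit {limit}"
    )


query = build_hourly_query(DEVICE_IDS)
print(query)
```

The resulting string still needs to be URL-encoded before it is appended to the endpoint as the `$query` parameter.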
I documented the process to connect to an Open Data Portal dataset using the Web Connector in GitBook and emailed the link to the MMC.
Email to the MMC - Here is some documentation on connecting to the dataset using the Web Connector. Let me know if you have any questions.
@SurbhiBakshi thanks for your help with this. Closing.
Other / Not Sure
Currently we are using Power BI (with Surbhi's help) to analyze the Gridsmart data in Socrata via an OData connection. The Gridsmart data is binned in 15-minute intervals, and despite our best efforts in PowerQuery to limit our data size, it is large enough that automated refreshes fail.
Can you help us develop a Socrata Query (suggested by John Clary) to help us limit the data that Power BI is having to import to just the data we require? Below is how we'd like to transform the data.
- Can we change it to being binned by each hour instead of 15 mins?
- Remove all data prior to Jan 1, 2020
- Filter to only see the ATD Device IDs shown in the attachment?
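To make the requested transformation concrete, here is a small local Python sketch of the three steps: drop rows before Jan 1, 2020, keep only selected devices, and sum 15-minute volumes into hourly bins. It only illustrates the logic; the real work would happen in the Socrata query. The field names `atd_device_id`, `read_date`, and `volume` are assumed to match the dataset, and the sample rows (including device `9999`) are made up.

```python
from collections import defaultdict
from datetime import datetime

# Rows before this timestamp are dropped (Jan 1, 2020, per the request).
CUTOFF = datetime(2020, 1, 1)


def bin_hourly(rows, device_ids):
    """Sum 15-minute volumes into (device, date, hour) bins."""
    totals = defaultdict(int)
    for row in rows:
        read_time = datetime.fromisoformat(row["read_date"])
        if read_time < CUTOFF or row["atd_device_id"] not in device_ids:
            continue  # outside the date range or not a device of interest
        key = (row["atd_device_id"], read_time.date().isoformat(), read_time.hour)
        totals[key] += int(row["volume"])
    return dict(totals)


# Hypothetical sample rows, not real GRIDSMART readings.
SAMPLE = [
    {"atd_device_id": "6382", "read_date": "2020-03-02T08:00:00", "volume": "10"},
    {"atd_device_id": "6382", "read_date": "2020-03-02T08:15:00", "volume": "12"},
    {"atd_device_id": "6382", "read_date": "2020-03-02T08:30:00", "volume": "9"},
    {"atd_device_id": "6382", "read_date": "2020-03-02T08:45:00", "volume": "11"},
    {"atd_device_id": "6382", "read_date": "2019-12-31T23:45:00", "volume": "5"},
    {"atd_device_id": "9999", "read_date": "2020-03-02T08:00:00", "volume": "7"},
]

hourly = bin_hourly(SAMPLE, {"6382"})
print(hourly)  # {('6382', '2020-03-02', 8): 42}
```

The four 08:xx readings collapse into one hourly row, while the 2019 reading and the unlisted device are dropped.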
Reach out to me or Allyson Richey with questions
Flexible — An extended timeline is OK
Discuss feasibility in the next week or so and then talk through a plan. We are getting by now, but hopefully this shouldn't be a heavy lift.
Attachment (51.5kb)
Request ID: DTS21-101745