RwHendrickson closed 12 months ago
See the "Cleaning PurpleAir Station Data" section of Notebooks/1_QAQC/2_PurpleAir_Stations.ipynb
I am working on getting an updated sensor list from the City as well.
Made the changes to the function in 9e647de. Another question: what happens if sensors go down for longer than an hour? Should we do anything different?
One thing that I've only just realized is that PurpleAir is no longer free! See https://community.purpleair.com/t/api-pricing/4523
They say that they're flexible and could award free data pulls to sensor owners (and potentially to orgs that owners refer).
I'm emailing them about it, but here are some quick monthly calculations for our current model (we may need to reconsider some things; this calculation takes channel_state out of the regular query):
Points Per 10-min Query = 285 pts = get_sensors (5 pts) + (pm2.5_atm (2 pts) + channel_flags (1 pt) + last_seen (1 pt)) * 70 sensors
Points Per Day = 41,040 pts = 285 pts per query * 144 queries per day (one every 10 mins)
Cost Per Month = $8.21 = 41,040 pts per day * 30 days / 150,000 pts per dollar
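For anyone who wants to rerun the numbers with a different sensor count or field list, here is a minimal sketch of the same arithmetic. The point values, 70 sensors, and 150,000 pts per dollar are the figures from the calculation above; the function and variable names are made up for illustration and are not part of the repo.

```python
# Figures below are copied from the calculation in this comment.
POINTS_GET_SENSORS = 5                      # base cost of one get_sensors call
FIELD_POINTS = {"pm2.5_atm": 2, "channel_flags": 1, "last_seen": 1}
N_SENSORS = 70
QUERIES_PER_DAY = 24 * 60 // 10             # one query every 10 minutes = 144
DAYS_PER_MONTH = 30
POINTS_PER_DOLLAR = 150_000


def monthly_cost(n_sensors=N_SENSORS, field_points=FIELD_POINTS):
    """Return (points per query, points per day, dollars per month)."""
    points_per_query = POINTS_GET_SENSORS + sum(field_points.values()) * n_sensors
    points_per_day = points_per_query * QUERIES_PER_DAY
    dollars = points_per_day * DAYS_PER_MONTH / POINTS_PER_DOLLAR
    return points_per_query, points_per_day, dollars


per_query, per_day, dollars = monthly_cost()
print(per_query, per_day, round(dollars, 2))  # 285 41040 8.21
```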
Response from PurpleAir:
Your calculation looks to be accurate. I have included a tip on reducing it even more using the "Modified Since" field and some other Best Practices. Please don't hesitate to reach out with any additional questions.
Use the parameter "modified_since" and set it to the time of your last GET request. This will ensure that you only receive data for sensors that have reported new data since your last API call.
There are many fields you include in every request that rarely change. We recommend that you obtain these fields less frequently, as they only change when a sensor owner modifies the registration for their device. These include:
last_modified
private
name
location_type
latitude
longitude
altitude
date_created: This field will also only change when a sensor owner emails us and requests for their device to be archived
position_rating: Position rating can change either when a device's registration is modified or when a device is connected to a new WiFi network
channel_states: It's rather rare for a device to have two laser counters and for one of them to go offline and be undetectable
uptime: Many sensors can stay online for months without interruption. You may want to consider querying this less frequently than every hour
We recommend querying all of the above fields at a maximum frequency of once a day. Even then, the data is unlikely to change much day to day. However, the necessary frequency to query these fields can depend on your needs.
We also recommend querying confidence rather than confidence_auto and confidence_manual. Confidence takes into account confidence_auto and confidence_manual in its calculation.
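A rough sketch of how the two tips above (modified_since, and pulling the slow-changing fields at most once a day) could fit together. This only builds the request URL and does not call the API. The endpoint path and the show_only parameter reflect my reading of the PurpleAir API docs and should be double-checked there; build_query, fields_due, and the field groupings are hypothetical names, not code from the repo.

```python
from urllib.parse import urlencode

# Assumed endpoint for "Get Sensors Data"; verify against the API docs.
BASE_URL = "https://api.purpleair.com/v1/sensors"

# Fields in the regular 10-minute query (from the cost calculation above).
FAST_FIELDS = ["pm2.5_atm", "channel_flags", "last_seen"]

# Fields PurpleAir suggests pulling at most once a day. The email writes
# "channel_states"; the regular query in this issue used "channel_state".
SLOW_FIELDS = ["name", "location_type", "latitude", "longitude",
               "altitude", "position_rating", "channel_state", "uptime"]

ONE_DAY_S = 24 * 60 * 60


def fields_due(now, last_slow_query, slow_interval_s=ONE_DAY_S):
    """Pick the field list for this query; add slow fields once per interval."""
    if last_slow_query is None or now - last_slow_query >= slow_interval_s:
        return FAST_FIELDS + SLOW_FIELDS
    return FAST_FIELDS


def build_query(fields, sensor_ids=None, modified_since=None):
    """Build the request URL; modified_since is a UNIX timestamp in seconds."""
    params = {"fields": ",".join(fields)}
    if sensor_ids:
        # show_only is assumed to take a comma-separated list of sensor indexes.
        params["show_only"] = ",".join(str(i) for i in sensor_ids)
    if modified_since is not None:
        params["modified_since"] = int(modified_since)
    return BASE_URL + "?" + urlencode(params)


# Example: a regular query, 10 minutes after the last daily pull.
now = 1_700_000_600
url = build_query(fields_due(now, last_slow_query=1_700_000_000),
                  sensor_ids=[1001, 1002], modified_since=1_700_000_000)
print(url)
```

One thing to keep in mind with modified_since: sensors with no new data are left out of the response, so a missing sensor in a single pull doesn't by itself mean the sensor is down.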
Simplified the regular query here - 41006f1
Tried to work in modified_since, but it wasn't cooperating.
Still on the to-do: What happens if sensors go down and they had an alert out?
Changed code to not query PurpleAir for previously flagged sensors - 9ad77a7 & 1646180
Through email correspondence, found out that when the City takes down a sensor, our system doesn't catch it (because PurpleAir stops putting out real-time data on it entirely). It doesn't cause any failures, but it's something to note!
"I just archived an old sensor (“City of Minneapolis Community Air Monitoring Project 5”) that wasn’t working and installed a new sensor (“City of Minneapolis Community Air Monitoring Project 73”) on Tuesday." - 11/9/23
Workflow_2
This function should check to see if sensors are down (which would not necessarily mean an ended alert). We'll need to explore channel flags a little more to accomplish this - https://api.purpleair.com/#api-sensors-get-sensors-data
Return a list of sensors that are down along with spikes_df and runtime.
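A possible starting point for the "down" check, as a sketch only. It treats a sensor as down if it is missing from the latest pull (e.g. archived, as noted above) or if its last_seen is older than a cutoff. The one-hour cutoff is an assumption borrowed from the open question earlier in this thread, the function name and inputs are hypothetical, and channel_flags is not handled yet.

```python
ONE_HOUR_S = 60 * 60  # assumed cutoff; not a decided project setting


def find_down_sensors(expected_ids, last_seen_by_id, now, max_age_s=ONE_HOUR_S):
    """Return the expected sensor ids that look down.

    expected_ids: sensor indexes we expect to hear from
    last_seen_by_id: dict of sensor index -> last_seen (UNIX seconds) from the latest pull
    now: current time in UNIX seconds
    """
    down = []
    for sensor_id in expected_ids:
        last_seen = last_seen_by_id.get(sensor_id)
        if last_seen is None:
            # Not in the response at all (for example, archived by the owner).
            down.append(sensor_id)
        elif now - last_seen > max_age_s:
            # Still listed, but has not reported within the cutoff.
            down.append(sensor_id)
    return down


# Sensor 1 reported recently, sensor 2 is stale, sensor 3 is missing.
print(find_down_sensors([1, 2, 3], {1: 9_900, 2: 1_000}, now=10_000))  # [2, 3]
```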