Closed froi closed 4 years ago
The dashboards seem to have different data sources:
There might be a possibility that they are A/B testing. The data seem to be the same with some tweaks to the UI of the dashboards.
Sadly, there is no way of knowing. The official URL still returns a 404.
Without this URL working, we have no idea which dashboard we are supposed to keep looking at. If the health department has changed the domain for the dashboard, we have no knowledge of the new one.
Each Dashboard link is coupled with a REST content link, identified by the Widget ID.
For example, for dashboard 2, the ID is 3bfb64c9a91944bc8c41edd8ff27e6df, taken from the end of the URL. Its REST content link would be: https://www.arcgis.com/sharing/rest/content/items/3bfb64c9a91944bc8c41edd8ff27e6df/data
In Python, this REST link can easily be called using the Beautiful Soup and json modules as follows.
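To make the ID-to-link mapping concrete, here is a minimal sketch that derives the REST content link from a dashboard URL. It assumes the ID is the last segment of either the URL fragment or the URL path; the helper name `rest_content_url` and the example dashboard URL are made up for illustration and are not from the thread.

```python
from urllib.parse import urlparse

# URL pattern taken from the example link above
REST_TEMPLATE = "https://www.arcgis.com/sharing/rest/content/items/{item_id}/data"


def rest_content_url(dashboard_url: str) -> str:
    """Build the REST content link from a dashboard URL that ends in the ID."""
    parsed = urlparse(dashboard_url)
    # Assumption: the ID is the last segment of the fragment if there is one,
    # otherwise the last segment of the path.
    tail = parsed.fragment or parsed.path
    item_id = tail.rstrip("/").rsplit("/", 1)[-1]
    if not item_id:
        raise ValueError(f"no ID found at the end of {dashboard_url!r}")
    return REST_TEMPLATE.format(item_id=item_id)


# Hypothetical dashboard URL, used only to exercise the helper
example = "https://example.com/dashboard/index.html#/3bfb64c9a91944bc8c41edd8ff27e6df"
print(rest_content_url(example))
```

If the real dashboard URLs carry the ID somewhere else (for instance in a query parameter), the extraction line would need to change accordingly.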
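from urllib.request import Request, urlopen
from bs4 import BeautifulSoup
import json

url = 'https://www.arcgis.com/sharing/rest/content/items/3bfb64c9a91944bc8c41edd8ff27e6df/data'
# Send a browser-like User-Agent header with the request
req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
html = urlopen(req)
# Name the parser explicitly so bs4 does not warn about guessing one
soup = BeautifulSoup(html, 'html.parser')
# The response body is JSON, so parse the extracted text into a dict
output = json.loads(soup.text)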
My limited understanding of the ArcGIS data structure led to a very tailored extraction pipeline, built by slowly poking through what was made available. Eventually I found that the "widgets" key in the output JSON contains information of great interest.
In the case of Dashboard 2, there is a MapWidget that contains most of the tables of interest. This is not the case for Dashboard 1, at least as far as I could tell.
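As a rough illustration of that kind of exploration, the sketch below counts the widgets in a parsed payload by type. Only the "widgets" key comes from the description above; treating it as a list of dicts, the "type" field name, and the type spellings in the sample are all assumptions, so inspect a real payload before relying on them.

```python
from collections import Counter


def summarize_widgets(payload: dict) -> Counter:
    """Count widgets by their (assumed) "type" field."""
    # Assumption: payload["widgets"] is a list of dicts
    widgets = payload.get("widgets", [])
    return Counter(
        w.get("type", "unknown") for w in widgets if isinstance(w, dict)
    )


# Made-up payload standing in for the `output` dict from the REST call
sample = {
    "widgets": [
        {"type": "mapWidget", "name": "Map"},
        {"type": "indicatorWidget"},
        {"type": "indicatorWidget"},
    ]
}

for widget_type, count in summarize_widgets(sample).items():
    print(widget_type, count)
```

Running `summarize_widgets(output)` on both dashboards would be a quick way to see whether Dashboard 1 has any map-like widget at all.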
Thanks @sanchobarriga (nice username BTW 😆 )
I have a couple of suggestions for you.
Since the endpoint returns JSON to begin with, you might want to use the Requests library. It'll give you a cleaner way to work with JSON payloads.
Example:
import requests

url = 'https://services5.arcgis.com/klquQoHA0q9zjblu/arcgis/rest/services/Datos_Totales/FeatureServer/0/query?f=json&where=1%3D1&returnGeometry=false&spatialRel=esriSpatialRelIntersects&outFields=*&outSR=102100&resultOffset=0&resultRecordCount=50&cacheHint=true'
response = requests.get(url)
response.raise_for_status()  # Raise an exception on an HTTP error status
data = response.json()  # Parse the JSON payload into a Python dictionary
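The long query string in that URL can also be kept as a dictionary, which makes individual parameters easier to read and change. This is only a sketch using the standard library: the parameter names and values are copied from the URL above, while the helper name `build_query_url` and the idea of varying `resultOffset` to fetch later records are assumptions on my part.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Endpoint copied from the example URL above, without its query string
BASE = (
    "https://services5.arcgis.com/klquQoHA0q9zjblu/arcgis/rest/services/"
    "Datos_Totales/FeatureServer/0/query"
)


def build_query_url(offset: int = 0, count: int = 50) -> str:
    """Rebuild the query URL from a dict of parameters."""
    params = {
        "f": "json",
        "where": "1=1",  # urlencode turns this into 1%3D1
        "returnGeometry": "false",
        "spatialRel": "esriSpatialRelIntersects",
        "outFields": "*",
        "outSR": 102100,
        "resultOffset": offset,
        "resultRecordCount": count,
        "cacheHint": "true",
    }
    return f"{BASE}?{urlencode(params)}"


print(build_query_url())
```

With Requests, the same dictionary can be passed directly as `requests.get(BASE, params=params)`, which does the encoding for you.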
This Issue is being marked as Stale because it has 30 days without any interaction. CC: @code4puertorico/covid19
There appear to be two separate COVID-19 dashboards for PR.
So far, the differences seem to be visual and in data structure. The data seem to be the same.
We have no way of knowing whether both are "official". Both domains are under the bioseguridad umbrella.