Closed jluo41 closed 3 years ago
@floydluo Hello, on the tableau dashboard, I didn't find race per county, only for the state. Maybe, it's just not there in the dashboard
Hi, it's me again.
If run the above code, the race worksheet only returns the state level race data. However, if we open the link in chrome and click the county in the map, the dashboard will be updated to that county's race vaccine information. I have tried to simulated the POST method to send tableau's /select
URL to get the data. Then I got a list of values without keys. So I tried to map the values to keys based on indexes, which was not hundred percent accurate.
Here is another state problem by running your code, for the state South Carolina:
from tableauscraper import TableauScraper as TS
url = "https://public.tableau.com/views/NCDHHS_COVID-19_Dashboard_Vaccinations/Summary"
ts = TS()
ts.loads(url)
dashboard = ts.getDashboard()
#show dropdown input name
inputNames = dashboard.getDropdownInputs()
print(inputNames)
#show dropdown values for a given input name
values = dashboard.getDropdownValues("County-Map")
print(values)
dashboard = dashboard.setDropdown("County-Map", "Alamance County")
Then it report the error:
['County-Map', 'Param.Metrics Map']
['North Carolina', 'Alamance County', 'Alexander County', 'Alleghany County', 'Anson County', 'Ashe County', 'Avery County', 'Beaufort County', 'Bertie County', 'Bladen County', 'Brunswick County', 'Buncombe County', 'Burke County', 'Cabarrus County', 'Caldwell County', 'Camden County', 'Carteret County', 'Caswell County', 'Catawba County', 'Chatham County', 'Cherokee County', 'Chowan County', 'Clay County', 'Cleveland County', 'Columbus County', 'Craven County', 'Cumberland County', 'Currituck County', 'Dare County', 'Davidson County', 'Davie County', 'Duplin County', 'Durham County', 'Edgecombe County', 'Forsyth County', 'Franklin County', 'Gaston County', 'Gates County', 'Graham County', 'Granville County', 'Greene County', 'Guilford County', 'Halifax County', 'Harnett County', 'Haywood County', 'Henderson County', 'Hertford County', 'Hoke County', 'Hyde County', 'Iredell County', 'Jackson County', 'Johnston County', 'Jones County', 'Lee County', 'Lenoir County', 'Lincoln County', 'Macon County', 'Madison County', 'Martin County', 'Mcdowell County', 'Mecklenburg County', 'Mitchell County', 'Montgomery County', 'Moore County', 'Nash County', 'New Hanover County', 'Northampton County', 'Onslow County', 'Orange County', 'Pamlico County', 'Pasquotank County', 'Pender County', 'Perquimans County', 'Person County', 'Pitt County', 'Polk County', 'Randolph County', 'Richmond County', 'Robeson County', 'Rockingham County', 'Rowan County', 'Rutherford County', 'Sampson County', 'Scotland County', 'Stanly County', 'Stokes County', 'Surry County', 'Swain County', 'Transylvania County', 'Tyrrell County', 'Union County', 'Vance County', 'Wake County', 'Warren County', 'Washington County', 'Watauga County', 'Wayne County', 'Wilkes County', 'Wilson County', 'Yadkin County', 'Yancey County']
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-87-2e4d4aabf8bf> in <module>
17 print(values)
18
---> 19 dashboard = dashboard.setDropdown("County-Map", "Alamance County")
~/opt/miniconda3/envs/ox/lib/python3.9/site-packages/tableauscraper/TableauDashboard.py in setDropdown(self, inputName, value)
113 self._scraper, parameterNames[0], value
114 )
--> 115 return tableauscraper.dashboard.getWorksheetsCmdResponse(self._scraper, r)
~/opt/miniconda3/envs/ox/lib/python3.9/site-packages/tableauscraper/dashboard.py in getWorksheetsCmdResponse(TS, data)
91 output = []
92 for selectedZone in zonesWithWorksheet:
---> 93 frameData = utils.getWorksheetCmdResponse(selectedZone, dataFull)
94
95 if frameData is None:
~/opt/miniconda3/envs/ox/lib/python3.9/site-packages/tableauscraper/utils.py in getWorksheetCmdResponse(selectedZone, dataFull)
252 ]
253 if len(index["aliasIndices"]) > 0:
--> 254 frameData[f'{index["fieldCaption"]}-alias'] = [
255 onDataValue(it, t, cstring) for it in index["aliasIndices"]
256 ]
~/opt/miniconda3/envs/ox/lib/python3.9/site-packages/tableauscraper/utils.py in <listcomp>(.0)
253 if len(index["aliasIndices"]) > 0:
254 frameData[f'{index["fieldCaption"]}-alias'] = [
--> 255 onDataValue(it, t, cstring) for it in index["aliasIndices"]
256 ]
257
~/opt/miniconda3/envs/ox/lib/python3.9/site-packages/tableauscraper/utils.py in onDataValue(it, value, cstring)
157
158 def onDataValue(it, value, cstring):
--> 159 return value[it] if (it >= 0) else cstring[abs(it) - 1]
160
161
IndexError: list index out of range
It seems that the code cannot match values with keys. Do you have some suggestions?
Thank you,
Junjie
@floydluo There is a bug in the library
For the 1st case you can use the following to get the race data for a specific county in the Map
worksheet
from tableauscraper import TableauScraper as TS
url = "https://bi.wisconsin.gov/t/DHS/views/VaccinesAdministeredtoWIResidents/VaccinatedWisconsin-County"
ts = TS()
ts.loads(url)
dashboard = ts.getWorksheet("Map").select("County", "Waukesha County")
print(dashboard.getWorksheet("Race vax/unvax county").data)
Note that the following will return same error as above, I have to investigate this :
from tableauscraper import TableauScraper as TS
url = "https://bi.wisconsin.gov/t/DHS/views/VaccinesAdministeredtoWIResidents/VaccinatedWisconsin-County"
ts = TS()
ts.loads(url)
dashboard = ts.getDashboard()
for t in dashboard.worksheets:
print(f"WORKSHEET NAME : {t.name}")
# show selectable columns
columns = ts.getWorksheet("Map").getSelectableColumns()
print(columns)
# show values by column name
values = ts.getWorksheet("Map").getValues("County")
print(values)
# select that value
dashboard = ts.getWorksheet("Map").select("County", "Waukesha County")
print(dashboard.getWorksheet("Race vax/unvax county").data)
The 2nd case error may be related to the former one, but I have to check
@floydluo I've narrowed down the problem. The data dictionary that is retrieved from the dashboard looks like :
{
"0": {
"cstring": [....],
.....
}
}
But when we perform the select request, sometimes it returns the whole data + the additional one:
{
"0": {
"cstring": [....],
.....
},
"1": {
"cstring": [....],
.....
}
}
And sometimes it only returns the missing data:
{
"1": {
"cstring": [....],
.....
}
}
As the indexing is extending all the objects in the data dictionary, there is an index error.
I would need to record the data between requests. I need to perform more tests to see if it has impact when performing multiple select requests in the same session
@floydluo This is fixed in v0.0.8
1st usecase: https://replit.com/@bertrandmartel/TableauCovidWisconsin
from tableauscraper import TableauScraper as TS
url = "https://bi.wisconsin.gov/t/DHS/views/VaccinesAdministeredtoWIResidents/VaccinatedWisconsin-County"
ts = TS()
ts.loads(url)
worksheet = ts.getWorksheet("Map")
dashboard = worksheet.select("County", "Waukesha County")
print(dashboard.getWorksheet("Race vax/unvax county").data)
dashboard = worksheet.select("County", "Forest County")
print(dashboard.getWorksheet("Race vax/unvax county").data)
2nd usecase: https://replit.com/@bertrandmartel/TableauCovidNCDHHS
from tableauscraper import TableauScraper as TS
url = "https://public.tableau.com/views/NCDHHS_COVID-19_Dashboard_Vaccinations/Summary"
ts = TS()
ts.loads(url)
dashboard = ts.getDashboard()
countyDashboard = dashboard.setDropdown("County-Map", "Alamance County")
print(countyDashboard.getWorksheet("Total Card Summary").data)
metricDashboard = dashboard.setDropdown(
"Param.Metrics Map",
"People Fully Vaccinated"
)
print(metricDashboard.getWorksheet("Total Card Summary").data)
@bertrandmartel That's great! Thank you so much. I will look your source code to further understand it.
@floydluo Hello I've just fixed an issue regarding the same data recording issue. You may want to upgrade to v0.0.9 if needed
@bertrandmartel Hello, thanks for your great work. It works well for Wisconsin. The selection function is perfect to deal with this state.
However, when I turn to the South Carolina, I find the selection function doesn't work. It will return the same df
when selecting different counties.
For example
from tableauscraper import TableauScraper as TS
url = 'https://public.tableau.com/views/COVIDVaccineDashboard/RECIPIENTVIEW'
ts = TS()
ts.loads(url)
county1 = 'Greenville'
dashboard = ts.getWorksheet("Vaccine Map By SC residents PEOPLE").select('Recipient County for maps', county1)
df1 = dashboard.getWorksheet('Final Age xSex x Race REC').data
url = 'https://public.tableau.com/views/COVIDVaccineDashboard/RECIPIENTVIEW'
ts = TS()
ts.loads(url)
county2 = 'Horry'
dashboard = ts.getWorksheet("Vaccine Map By SC residents PEOPLE").select('Recipient County for maps', county2)
df2 = dashboard.getWorksheet('Final Age xSex x Race REC').data
Here df1
and df2
are exactly the same.
Besides, there is also a problem in dealing with Ohio data (I have scraped Ohio by simulating tableau's API with my dirty codes, just to share this problem with you).
Ohio needs to first Select
the county and then Drop
set the value Race
. However, the drop
method didn't work here.
You can try the following code:
from tableauscraper import TableauScraper as TS
url = "https://public.tableau.com/views/VaccineAdministrationMetricsDashboard/PublicCountyDash"
ts = TS()
ts.loads(url)
dashboard = ts.getWorksheet("New Map").select("county", 'Wyandot')
dashboard = dashboard.setDropdown('Key Metrics', "Race")
df = dashboard.getWorksheetname('New Demographics').data
Here df
is still the Age table.
Thanks!
Hello, for the 1st case, is the chart you are working with is the one titled Rate of Vaccine Recipients by County per 10,000 Residents
?
Hi, @bertrandmartel, the chart I am interested in is this one, which contains the race information.
In the code, the associated worksheet is Final Age xSex x Race REC
.
All sheets are as follows:
['Doses Admin KPI number',
'Interpretation',
'DOSES SC RES',
'Gauge Partial',
'Final Age xSex x Race REC',
'Custom Axis REC',
'Ethnicity Pie REC',
"Average Age Table REC'D",
'People for 1.1',
'Vaccine Map By SC residents PEOPLE',
'Vaccine Map By SC residents PEOPLE (3)',
'County Table People Sc Residents',
'People for 1.1 Filter',
'title 1.1',
'1/2 Doses FOR 1.1 (2)',
'1/2 Doses FOR 1.1 (3)',
'% of Phase 1A',
'Gauge 1 Text box',
'Gauge Compllete',
'Gauge 2 Text Box',
'People Completed for 1.1',
'Census link',
'% of Doses Admin SC to SC residents']
Oh I see, so this is not when you click on a county on the map that it changes the data. But when using the dropdown. In fact, I called it setDropDown but it's actually a /set-parameter-value
API call
In this case it calls /categorical-filter-by-index
which is not implemented
OK, got it. And it is also hard to simulate the /categorical-filter-by-index
in python. I tried but failed. Instead, I can try the zip code
selection, which can be clicked in the map. I will try later and report the results to you.
Thanks for your great work.
For Ohio, the problem is just that the select
doesn't return back the dropdown list values (fieldCaption
with isAutoSelect
). This is not really a bug, you can use a distinct variable for each dashboard:
from tableauscraper import TableauScraper as TS
url = "https://public.tableau.com/views/VaccineAdministrationMetricsDashboard/PublicCountyDash"
ts = TS()
ts.loads(url)
dashboard = ts.getDashboard()
countryDashboard = ts.getWorksheet("New Map").select("county", 'Wyandot')
raceDashboard = dashboard.setDropdown('Key Metrics', "Race")
df = raceDashboard.getWorksheet('New Demographics').data
print(df)
Note that I use dashboard.setDropdown
and not countryDashboard.setDropdown
.
I'm not sure if I could record the presModel
between each call, this would be hard to code/maintain. For now, using separate dashboard object is sufficient I think
The code above is supposed to work but it doesn't, there is a bug with the index of tuple which is easy to fix
This is first time I've noticed a paneIndices
and columnIndices
of length 2:
{
'fn': '[system:visual].[tuple_id]',
'fnDisagg': '',
'formatStrings': [],
'isAutoSelect': False,
'paneIndices': [0, 1],
'columnIndices': [0, 0]
}
This means that this feature targets both index (0,0) and index (1,0) but in the original code I take only first index (0,0) which is not correct in this case. This is annoying because it means that for a single field you could have multiple set of values. I will see if that's easy to drop the bad index and choose the correct one
Ok the Ohio issue is fixed, I will try to deal with /categorical-filter-by-index
I've investigated for the category filter, the filter values are located in the info
JSON (not data), a custom parsing is needed. The values are indexed from 0 (those are not like the select values that are indexed using a dynamic list of tupleIds
).
This should be easy to implement. Tomorrow (sunday), I will release a new version with all the fix
Hi, @bertrandmartel, the code can handle Ohio now. For the South Carolina, I have tried to select the Zipcode instead of the County, but it also returns the same results. Maybe we have to turn to category filter
. I appreciate your quick response to the issue and I do learn a lot from your project. Cheers!
For my own experience of simulating category filter
, I can't compose the request properly and only get bad response with one short line. Look forward to your methods.
@floydluo I've made a release which implements the filter API (v0.1.0). This release also renamed a bunch of methods in order to keep things more close to the underlying Tableau API.
getDashboard
renamed to getWorkbook
getDropDownInputs
and getDropdownValues
replaces by getParameters()
setDropDown()
replaced by setParameter
getValues
and getSelectableColumns
replaced by getSelectableItems()
Additional method are: getFilters()
and setFilter
on the worksheet class
from tableauscraper import TableauScraper as TS
url = "https://public.tableau.com/views/VaccineAdministrationMetricsDashboard/PublicCountyDash"
ts = TS()
ts.loads(url)
wb = ts.getWorkbook()
ws = ts.getWorksheet("New Map")
counties = [
t["values"] for t in ws.getSelectableItems()
if t["column"] == "county"
][0]
print(counties)
wb.setParameter('Key Metrics', "Race")
for county in counties:
print(county)
wb = ws.select("county", county)
demograhics = wb.getWorksheet('New Demographics')
print(demograhics.data)
from tableauscraper import TableauScraper as TS
import pandas as pd
url = 'https://public.tableau.com/views/COVIDVaccineDashboard/RECIPIENTVIEW'
ts = TS()
ts.loads(url)
ws = ts.getWorksheet("Vaccine Map By SC residents PEOPLE")
print(ws.data)
# show filter columns and values
filters = ws.getFilters()
print(filters)
# get all counties values
counties = [
t["values"]
for t in filters
if t["column"] == "[Recipient County for maps]"
][0]
for county in counties:
print(county)
wb = ws.setFilter('[Recipient County for maps]', county)
countyWs = wb.getWorksheet("Final Age xSex x Race REC")
print(countyWs.data)
from tableauscraper import TableauScraper as TS
url = "https://bi.wisconsin.gov/t/DHS/views/VaccinesAdministeredtoWIResidents/VaccinatedWisconsin-County"
ts = TS()
ts.loads(url)
worksheet = ts.getWorksheet("Map")
dashboard = worksheet.select("County", "Waukesha County")
print(dashboard.getWorksheet("Race vax/unvax county").data)
dashboard = worksheet.select("County", "Forest County")
print(dashboard.getWorksheet("Race vax/unvax county").data)
from tableauscraper import TableauScraper as TS
url = "https://public.tableau.com/views/NCDHHS_COVID-19_Dashboard_Vaccinations/Summary"
ts = TS()
ts.loads(url)
dashboard = ts.getWorkbook()
countyDashboard = dashboard.setParameter("County-Map", "Alamance County")
print(countyDashboard.getWorksheet("Total Card Summary").data)
metricDashboard = dashboard.setParameter(
"Param.Metrics Map",
"People Fully Vaccinated"
)
print(metricDashboard.getWorksheet("Total Card Summary").data)
Great work! I will follow it up. Thank you.
@floydluo you're welcome. I've found this NYC Tableau vaccine tracker and I've setup a repo scraping the data every day with Github Actions : https://github.com/bertrandmartel/covid19-nyc-vaccine-tracker
I've implemented the goToSheet
api command for this usecase, in case you need it later, the usage is here
also this one for Idaho uses the goToSheet
command (big dashboard button)
Hi, @bertrandmartel, great work!
Here I just want to make clear about the conceptions used in Tableau and Tableau Scraper. What the relationships among Dashboard, Workbook, Worksheet? When we initialized a ts
, what does it refer to as? What is the difference between SetParameter and SelectValue? Does Workbook SelectValue and Worksheet SetParameter?
Thanks!
@floydluo Hello,
dashboard is the same as workbook. ts is just TableauScraper object that is used to extract the initial data. SetParameter
calls set-parameter-value
API here and SelectValue
calls select
API here
The setParameter
doesn't depends on the worksheet, see the payload, it just sets parameterName/parameterValue
The worksheet
depends on the worksheet, see the payload
@bertrandmartel Hi, sorry to bother again. But the code for South Caroline is dead again. It seems only one county is in the [Recipient County for maps]
now.
@floydluo I see, I've identified the problem. It seems the quickFilterDisplay
is irrelevant and I should detect filtersJson
instead which is another complex objects with schema fields and tuples fields
@floydluo this is fixed in v1.0.3, the column name is a bit different :
https://replit.com/@bertrandmartel/TableauCovidSouthCarolina
from tableauscraper import TableauScraper as TS
url = 'https://public.tableau.com/views/COVIDVaccineDashboard/RECIPIENTVIEW'
ts = TS()
ts.loads(url)
ws = ts.getWorksheet("Vaccine Map By SC residents PEOPLE")
print(ws.data)
# show filter columns and values
filters = ws.getFilters()
print(filters)
# get all counties values
counties = next(iter([
t["values"]
for t in filters
if t["column"] == "Recipient County for maps"
]))
print(counties)
for county in counties:
print(county)
wb = ws.setFilter('Recipient County for maps', county)
countyWs = wb.getWorksheet("Final Age xSex x Race REC")
print(countyWs.data)
Got it! Thanks. It works perfectly.
Hi, thanks for this awesome toolkit. I used your library to scrape an interactive map data. But I encounter a problem of getting insufficient data.
I want to scrape every county's vaccine demographic information show in this page: https://www.dhs.wisconsin.gov/covid-19/vaccine-data.htm#day
I used the following code to get the sheet.
It returned a sheet named :
Wisconsin_Race vax_unvax county.csv
. However, this sheet only contains the total state level information. How could I get each county's information by your package?Thanks.