bertrandmartel / tableau-scraping

Tableau scraper python library. R and Python scripts to scrape data from Tableau viz
MIT License

measured values column is not extracted #9

Closed jluo41 closed 3 years ago

jluo41 commented 3 years ago

Hi @bertrandmartel, I ran into a new problem when using TS to scrape data from the New York State Tableau dashboard. Its URL is: https://covid19vaccine.health.ny.gov/vaccine-demographic-data

The original data look like this:

image

I tried the following code to scrape the data:

from tableauscraper import TableauScraper as TS

url = "https://covid19tracker.health.ny.gov/views/Race_Ethnicity_Public/RacebyCounty"

ts = TS()
ts.loads(url)

workbook = ts.getWorkbook()

parameters = workbook.getParameters()
print(parameters)

# setParameter returns the updated workbook, so the URL does not need to be loaded again
workbook = workbook.setParameter('Show Value as', "Number")
# display worksheets
workbook.getWorksheet('Race').data

But what I get is this:

image

So the detailed values are replaced with %all%. I don't understand why. Do you have any suggestions?

Thanks very much!

bertrandmartel commented 3 years ago

@floydluo sorry for the late response. It seems %all% is normal here: it means that the row concerns all regions/counties, e.g. as if this value were empty in a data table. I guess the issue is rather that the Measure Names-value column is not returning data but something like [federated.0nvcif80f24s4c147rz6d1mpve6p (copy)].[usr:Calculation_626563313086001155:qk], which looks like a calculated field if I'm not mistaken. Maybe the data is in other sheets?
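If the %all% placeholder gets in the way downstream, one option is to treat it as a missing value once the worksheet has been loaded into pandas. The sketch below uses a small hand-built DataFrame as a stand-in for `workbook.getWorksheet('Race').data`; the column names `County-value` and `Race-value` are assumptions for illustration, not the actual columns of this dashboard.

```python
import pandas as pd

# Hypothetical stand-in for workbook.getWorksheet('Race').data;
# the column names here are made up for the example.
df = pd.DataFrame({
    "County-value": ["%all%", "Albany", "%all%"],
    "Race-value": ["Asian", "Asian", "White"],
})

# Treat the "%all%" placeholder as a missing value.
cleaned = df.replace("%all%", pd.NA)

# Rows whose county is missing are the ones that apply to all counties.
all_counties = cleaned[cleaned["County-value"].isna()]
print(all_counties)
```

Whether to drop, keep, or relabel those rows depends on what the aggregate rows mean in the viz, so this is only a starting point.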

bertrandmartel commented 3 years ago

I've found that the column holding the values is missing from the output. It is marked as type real, whereas it seems the data needs to be read from the cstring field.
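To illustrate the kind of mismatch described above, here is a simplified sketch. It is not the library's actual code. It assumes a value dictionary keyed by data type (`real`, `cstring`) and assumes that a negative index points into the `cstring` pool (`-1` for the first entry) rather than the pool of the column's declared type. The function name `resolve_column` and the sample data are hypothetical.

```python
def resolve_column(indices, data_type, dictionary):
    """Resolve index references into values.

    Assumed convention (for illustration only): a non-negative index
    points into the pool for the column's declared type, while a
    negative index -n points at cstring[n - 1].
    """
    typed_pool = dictionary.get(data_type, [])
    string_pool = dictionary.get("cstring", [])
    values = []
    for i in indices:
        if i >= 0:
            values.append(typed_pool[i])
        else:
            values.append(string_pool[-i - 1])
    return values


# Hypothetical dictionary: the column is declared "real",
# but some of its displayed values live in the cstring pool.
dictionary = {
    "real": [0.5, 1.5],
    "cstring": ["1,234", "5,678"],
}

resolved = resolve_column([0, -1, -2], "real", dictionary)
print(resolved)
```

A reader that only ever looked in the `real` pool would miss the string-formatted entries, which is one way a whole column could come back without its values.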

bertrandmartel commented 3 years ago

It is fixed in v0.1.8:

from tableauscraper import TableauScraper as TS

url = "https://covid19tracker.health.ny.gov/views/Race_Ethnicity_Public/RacebyCounty"

ts = TS()
ts.loads(url)

workbook = ts.getWorkbook()

parameters = workbook.getParameters()
print(parameters)

# setParameter returns the updated workbook, so the URL does not need to be loaded again
workbook = workbook.setParameter('Show Value as', "Number")
# display worksheets
print(workbook.getWorksheet('Race').data)

jluo41 commented 3 years ago

@bertrandmartel That's great! Thank you.