Closed macticus closed 4 years ago
Hi @macticus would you mind sending a picture of your prep flow? What do you have going into the script node?
Hi nmannheimer,
That was it. I had a null input from the previous step. Because Tab Prep doesn't allow scripts as the first step (and it has no web data connector), I chose the TabPy script path as it is incredibly versatile. I created a stub data source with only one column with no data and had a script as my first step. As soon as I added one datum to the stub data source, it worked. Thanks for responding so quickly especially with the exactly correct answer!
Environment information:
Describe the issue I'm attempting to connect to Data.World to pull the COVID-19 dataset using Tab Prep and the scripting step. I'm using the Data.World Python API module datadotworld. This is the Tableau curated dataset. I've installed and configured the datadotworld module correctly with my API key. When I install the module and execute the integration code directly in a Python IDE, it works. When I execute through Tab Prep/TabPy, it returns immediately with no results. Logging is set to DEBUG and TabPy logs no relevant messages. I have tried two datadotworld methods and neither generates results.
To Reproduce Run this script. I have used two different methods and both fail.
import datadotworld as dw
import pandas as pd

def getCases(df):
    results = dw.query('covid-19-data-resource-hub/covid-19-case-counts', 'SELECT * FROM covid_19_cases')
    cases_df = results.dataframe
    return cases_df  # Tab Prep expects the script function to return a data frame

def get_output_schema():
    return pd.DataFrame({
        'CASE_TYPE': prep_string(), 'CASES': prep_decimal(), 'DIFFERENCE': prep_decimal(), 'DATE': prep_date(), 'COUNTRY_REGION': prep_string(),
        'PROVINCE_STATE': prep_string(), 'ADMIN2': prep_string(), 'COMBINED_KEY': prep_string(), 'FIPS': prep_decimal(), 'LAT': prep_decimal(), 'LONG': prep_decimal(), 'TABLE_NAMES': prep_string(), 'PREP_FLOW_RUNTIME': prep_date(),
    })
Expected behavior Method should access data source on DataDotWorld and return to TabPy as a pandas data frame. If there are exceptions during execution, they should be handled and returned to logger. If there are limitations in functionality, they should be documented. The code runs perfectly in Python IDE.
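For anyone debugging a similar silent failure, the exception handling and logging asked for above could be sketched roughly as follows. This is a minimal illustration, not TabPy's or Tab Prep's own mechanism: the names get_cases, fetch_cases, and the logger name prep_script are hypothetical, the stand-in fetch returns a plain list of rows instead of calling datadotworld, and whether TabPy surfaces these log records depends on its logging configuration, which is assumed here.

```python
import logging

logger = logging.getLogger("prep_script")  # hypothetical logger name


def fetch_cases():
    # Hypothetical stand-in for the real call, which in the script above
    # would be dw.query(...).dataframe and return a pandas DataFrame.
    return [{"CASE_TYPE": "Confirmed", "CASES": 1.0}]


def get_cases(df, fetch=fetch_cases):
    # The input is only a trigger for the script step; log its size so an
    # empty stub input shows up in the log.
    logger.debug("input rows: %d", len(df))
    try:
        cases = fetch()
    except Exception:
        # Record the full traceback, then re-raise so the failure is not
        # swallowed silently.
        logger.exception("fetching cases failed")
        raise
    logger.debug("fetched rows: %d", len(cases))
    return cases
```

With logging at DEBUG, a line reporting zero input rows would point at the stub data source described in the reply above, while a traceback would point at the data.world call itself.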