IntelPython / sdc

Numba extension for compiling Pandas data frames, Intel® Scalable Dataframe Compiler
https://intelpython.github.io/sdc-doc/
BSD 2-Clause "Simplified" License

TypingError in read_csv() #941

Closed akharche closed 2 years ago

akharche commented 3 years ago

TypingError: Cannot infer resulting DataFrame from constant file or parameters.

import numba
import pandas as pd

@numba.njit(parallel=True)
def foo(dpath):
    df = pd.read_csv(dpath)
    print(df.head())

ashokei commented 3 years ago

This also happens with Intel Python 3.7 and SDC 0.37.

read_csv(unicode_type)

There are 2 candidate implementations:
  - Of which 2 did not match due to:
  Overload in function 'sdc_pandas_read_csv': File: sdc/datatypes/hpat_pandas_functions.py: Line 79.
    With argument(s): '(unicode_type)':
   Rejected as the implementation raised a specific error:                      
     TypingError: Cannot infer resulting DataFrame from constant file or parameters.            
  raised from /lib/python3.7/site-packages/sdc/datatypes/hpat_pandas_functions.py:227
kozlov-alexey commented 3 years ago

@ashokei Yes, the usage above, read_csv with a non-constant file name (or with other non-constant parameters that determine the resulting DataFrame's column types), is currently unsupported. We want to support it in the future, so I've marked this as a feature request. Until then, please pass a constant file name to read_csv inside the jitted function. See the documentation for details: https://intelpython.github.io/sdc-doc/latest/_api_ref/pandas.read_csv.html#limitations