cdqa-suite / cdQA

⛔ [NOT MAINTAINED] An End-To-End Closed Domain Question Answering System.
https://cdqa-suite.github.io/cdQA-website/
Apache License 2.0

pdf_converter cdqa throws AttributeError: type object 'object' has no attribute 'dtype' #372

Open Meenakshi-Devi opened 1 year ago

Meenakshi-Devi commented 1 year ago

AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_55/1136373057.py in <module>
----> 1 df = pdf_converter(directory_path='./data/pdf/')
      2 df.head()

/srv/conda/envs/notebook/lib/python3.7/site-packages/cdqa/utils/converters.py in pdf_converter(directory_path, min_length, include_line_breaks)
    167         if file.endswith("pdf"):
    168             list_pdf.append(file)
--> 169     df = pd.DataFrame(columns=["title", "paragraphs"])
    170     for i, pdf in enumerate(list_pdf):
    171         try:

/srv/conda/envs/notebook/lib/python3.7/site-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
    408             )
    409         elif isinstance(data, dict):
--> 410             mgr = init_dict(data, index, columns, dtype=dtype)
    411         elif isinstance(data, ma.MaskedArray):
    412             import numpy.ma.mrecords as mrecords

/srv/conda/envs/notebook/lib/python3.7/site-packages/pandas/core/internals/construction.py in init_dict(data, index, columns, dtype)
    240             else:
    241                 nan_dtype = dtype
--> 242             val = construct_1d_arraylike_from_scalar(np.nan, len(index), nan_dtype)
    243             arrays.loc[missing] = [val] * missing.sum()
    244

/srv/conda/envs/notebook/lib/python3.7/site-packages/pandas/core/dtypes/cast.py in construct_1d_arraylike_from_scalar(value, length, dtype)
   1219     else:
   1220         if not isinstance(dtype, (np.dtype, type(np.dtype))):
-> 1221             dtype = dtype.dtype
   1222
   1223         if length and is_integer_dtype(dtype) and isna(value):

AttributeError: type object 'object' has no attribute 'dtype'
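
For anyone reading the traceback: the last frame only reaches `dtype.dtype` when the `isinstance` check on line 1220 of `cast.py` is false for the builtin `object` class. The sketch below reproduces that mechanism with the standard library only. `OldStyleDType`, `DTypeMeta`, `NewStyleDType` and `resolve_dtype` are hypothetical stand-ins written for this illustration, not numpy or pandas names. This pattern is commonly reported when an older pandas release runs against a newer numpy in which `type(np.dtype)` is no longer plain `type`; whether that is the cause here is an assumption, since the report does not list the installed versions.

```python
# Stand-ins for illustration only; these are NOT numpy's real classes.
class OldStyleDType:
    """Mimics a dtype class whose metaclass is plain `type`."""


class DTypeMeta(type):
    """Hypothetical custom metaclass for a dtype class."""


class NewStyleDType(metaclass=DTypeMeta):
    """Mimics a dtype class whose metaclass is no longer plain `type`."""


def resolve_dtype(dtype, dtype_cls):
    """Mirror the check shown at cast.py lines 1220-1221 in the traceback."""
    if not isinstance(dtype, (dtype_cls, type(dtype_cls))):
        # Reached only when `dtype` is neither a dtype instance nor an
        # instance of the dtype class's metaclass.
        dtype = dtype.dtype
    return dtype


# Plain-`type` metaclass: `object` is an instance of `type`, so the
# check passes and `.dtype` is never touched.
assert resolve_dtype(object, OldStyleDType) is object

# Custom metaclass: `object` is not an instance of `DTypeMeta`, so the
# code falls through to `object.dtype` and raises.
message = None
try:
    resolve_dtype(object, NewStyleDType)
except AttributeError as exc:
    message = str(exc)

print(message)
```

If this is indeed what is happening in the notebook environment, printing `pd.__version__` and `np.__version__` before calling `pdf_converter` would confirm whether the two installed versions are mismatched.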