SmritiSatyan closed this issue 4 years ago
I am facing the same error.
I was unable to get this to work, but I found a workaround. Instead of creating the CSV file manually, I stored my data in separate PDFs and ran the pdf_converter function on the directory containing them. This generated a dataframe in which every row corresponds to a single PDF.
@SmritiSatyan,
What did your dataframe look like after you removed the literal_eval function? Did the paragraphs column keep its list structure?
It should look like this:
paragraphs |
---|
[Paragraph 1 of Article, ... , Paragraph N of Article] |
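For anyone preparing the CSV by hand, here is a minimal sketch of one way to produce a file whose paragraphs cells parse back into lists. It uses only the standard library `csv` module rather than pandas, and the sample articles and the in-memory buffer are assumptions for illustration, not data from this issue.

```python
import ast
import csv
import io

# Hypothetical sample data: one row per article, paragraphs kept as a Python list.
articles = [
    ("Article A", ["First paragraph, with a comma.", "Second paragraph."]),
    ("Article B", ["Only paragraph of article B."]),
]

# Write the CSV. csv.writer quotes any cell that contains a comma,
# so commas inside the list literal do not split the row.
buffer = io.StringIO()  # stands in for a real file on disk
writer = csv.writer(buffer)
writer.writerow(["title", "paragraphs"])
for title, paragraphs in articles:
    # repr() turns the list into a valid Python list literal
    writer.writerow([title, repr(paragraphs)])

# Read the rows back and parse each paragraphs cell into a real list.
buffer.seek(0)
rows = [
    {"title": row["title"], "paragraphs": ast.literal_eval(row["paragraphs"])}
    for row in csv.DictReader(buffer)
]
```

The point of writing the cell with `repr()` is that literal_eval can only parse text that is valid Python literal syntax, so the brackets and the quotes around each paragraph have to be present in the file.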
@andrelmfarias
I prepared the CSV manually, and when I loaded it with
df = pd.read_csv('path to csv')
it generated a dataframe with 2 columns: title and paragraphs. The 'paragraphs' column looked exactly like what you described. Yes, the list structure of the paragraphs was maintained.
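A cell can print like a list and still be a plain string, so it may be worth checking the type directly. The sketch below is only an illustration with a made-up two-row CSV held in memory; it compares reading with and without literal_eval passed through the `converters` argument of `pd.read_csv`.

```python
import io
from ast import literal_eval

import pandas as pd

# Hypothetical two-row CSV; each paragraphs cell is a quoted Python list literal.
csv_text = (
    "title,paragraphs\n"
    "Article A,\"['Paragraph 1', 'Paragraph 2']\"\n"
    "Article B,\"['Only paragraph']\"\n"
)

# Without a converter, each cell is a str that merely looks like a list.
df_raw = pd.read_csv(io.StringIO(csv_text))
print(type(df_raw.loc[0, "paragraphs"]))

# With literal_eval as the converter, each cell is parsed into a real list.
df = pd.read_csv(io.StringIO(csv_text), converters={"paragraphs": literal_eval})
print(type(df.loc[0, "paragraphs"]))
```

If the first check reports a string, any later code that expects a list of paragraphs per row will receive one long string instead.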
Hello, I see that the dataframe in the format (title, paragraphs) can be generated either with the converters or manually. I scraped the inspectapedia data and stored it in a text file. I then extracted the data manually and stored it in the [title, paragraph] columns of a CSV file, but I am unable to read the CSV file. I get the following error:
SyntaxError: invalid syntax
When I removed the literal_eval parameter, I got the following error instead:
ValueError: zero-dimensional arrays cannot be concatenated
Any help on this would be appreciated.
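One possible cause of the first error, offered as a guess rather than a confirmed diagnosis, is that the cells hold raw paragraph text instead of a Python list literal. The sketch below shows how literal_eval reacts to each kind of cell. Both sample strings and the `to_paragraph_list` helper are hypothetical.

```python
from ast import literal_eval

# Hypothetical cell contents: raw text versus a list literal.
plain_cell = "This is the first paragraph. This is the second."
list_cell = "['This is the first paragraph.', 'This is the second.']"

# Raw prose is not valid Python literal syntax, so parsing it fails.
try:
    literal_eval(plain_cell)
    plain_error = None
except (SyntaxError, ValueError) as exc:
    plain_error = type(exc).__name__
print(plain_error)

# A properly quoted list literal parses into a list of strings.
paragraphs = literal_eval(list_cell)
print(paragraphs)


def to_paragraph_list(text):
    """Hypothetical helper: split raw text on blank lines into a list of paragraphs."""
    return [part.strip() for part in text.split("\n\n") if part.strip()]
```

If the cells do hold raw text, applying something like `to_paragraph_list` to the column after a plain `pd.read_csv` is one way to get lists without going through literal_eval at all.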