"If I Had a Million Dollars”

yuanweij commented 1 year ago

Du Bois, W.E.B. 1932. “If I Had a Million Dollars.” The Crisis 39(11): 347.

Proposed section: Education

Attachments: If I Had a Million Dollars.pdf, If I Had a Million Dollars.md

Note: GPT-4 rocks with OCR.

nealcaren commented 1 year ago

Can you share your GPT-4 prompts for OCRing?

yuanweij commented 1 year ago

I used GPT-4 with the ChatOCR plugin. As for the prompts, it is usually an iterative process for each article. Here are some examples:

You are going to convert image or PDF files to text using ChatOCR. Here I have the image of an article. Try not to merge the columns. Please avoid unnecessary line breaks. Insert an empty line between paragraphs. Write the text in a markdown code block. https://www.staf.ai/api/files?fileId=b28b9146

Try this one: https://www.staf.ai/api/files?fileId=db5b5f40. It is a bit tricky. It has three columns and includes 1 figure and 2 tables. Please ignore the figure for now. But do make sure to extract the tables.

There are 3 paragraphs and 2 tables. One of the tables is laid out vertically. Please leave out the tables for now: https://www.staf.ai/api/files?fileId=da445043

Now extract the table that is laid out vertically. Its title is Voters-South-1920. It has 5 rows representing 5 states and 5 columns.
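
The example prompts above share a fixed set of instructions and vary mainly in the layout notes and the file link, so the per-article iteration could be sketched as a small Python helper. This is only a minimal sketch, not the workflow actually used in this thread: `BASE_INSTRUCTIONS`, `build_ocr_prompt`, and `layout_notes` are hypothetical names, and the helper only assembles text to paste into GPT-4. It makes no API or plugin calls.

```python
# Fixed instructions, copied from the first example prompt in this thread.
BASE_INSTRUCTIONS = (
    "You are going to convert image or PDF files to text using ChatOCR. "
    "Here I have the image of an article. "
    "Try not to merge the columns. "
    "Please avoid unnecessary line breaks. "
    "Insert an empty line between paragraphs. "
    "Write the text in a markdown code block."
)


def build_ocr_prompt(file_url, layout_notes=()):
    """Assemble one prompt: fixed instructions, per-article notes, file link.

    Hypothetical helper for illustration only. It builds a string and does
    not call GPT-4 or the ChatOCR plugin.
    """
    parts = [BASE_INSTRUCTIONS]
    # Skip empty or whitespace-only notes so the prompt stays tidy.
    parts.extend(note.strip() for note in layout_notes if note.strip())
    parts.append(file_url)
    return " ".join(parts)


# Usage: the three-column article from the second example prompt.
prompt = build_ocr_prompt(
    "https://www.staf.ai/api/files?fileId=db5b5f40",
    layout_notes=[
        "It has three columns and includes 1 figure and 2 tables.",
        "Please ignore the figure for now.",
        "But do make sure to extract the tables.",
    ],
)
print(prompt)
```

Keeping the layout notes as a separate list makes it easy to change one note between attempts (for example, dropping the figure first and asking for the tables later) without retyping the fixed instructions.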

nealcaren / fightordie

"If I Had a Million Dollars” #13