If you have experience with Python and pandas, you could try tabula-py instead of the GUI you're currently using.
Here's an example:
import tabula
# Get the table on page 128 of the PDF.
# In tabula-py 2.0+, read_pdf returns a list of DataFrames, so take the first one.
df = tabula.read_pdf('budget.pdf', pages=128)[0]
# Post processing
# Optional: Can also use R or Excel
#---------------------------------------------------
# Remove total row
df.drop([17], axis=0, inplace=True)
# Remove blank column of total
df.drop(df.columns[1], axis=1, inplace=True)
# Remove the leading number from the STG Code column
df['STG Code'] = df['STG Code'].apply(lambda x: x.split(' ', 1)[1])
#-----------------------------------------------------
# Export table to CSV
df.to_csv('data.csv', index=False)
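The hard-coded row index (17) and column position (1) above only fit that one page. If you need to clean several tables, here is a rough sketch of the same post-processing done by content instead of position. It assumes the total row is labelled "Total" in the 'STG Code' column and that the blank column is completely empty; the sample data and the clean_table helper are made up for illustration, so adjust them to what your PDF actually produces.

```python
import pandas as pd


def clean_table(df: pd.DataFrame) -> pd.DataFrame:
    """Clean an extracted table without hard-coding row or column positions."""
    # Drop columns that are entirely empty (e.g. a blank spacer column)
    out = df.dropna(axis=1, how="all")
    # Assumption: the total row is labelled "Total" in the 'STG Code' column
    is_total = (
        out["STG Code"].astype(str).str.strip().str.lower().str.startswith("total")
    )
    out = out.loc[~is_total].copy()
    # Strip the leading number, keeping the text after the first space
    out["STG Code"] = out["STG Code"].str.split(" ", n=1).str[1]
    return out.reset_index(drop=True)


# Hypothetical sample standing in for the DataFrame returned by tabula
sample = pd.DataFrame({
    "STG Code": ["1 Health", "2 Education", "Total"],
    "Blank": [None, None, None],
    "Amount": [100, 200, 300],
})
cleaned = clean_table(sample)
print(cleaned)
```

You could then call clean_table on each DataFrame that tabula returns before exporting it to CSV.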
Thank you @amitness for the suggestion! I will try this. If it makes the data easier to clean, I am going to use it. Cleaning the tables in the PDF is a hassle.