import camelot

tables = camelot.read_pdf('/Users/skatipomu/Table_Extraction_Camelot/page3.pdf', pages="all")
[table.accuracy for table in tables]
Output:
[99.99999999999997, -20.852716930856104]
I think the reason is that the compute_accuracy method in utils.py subtracts each error percentage from 1 when calculating accuracy. That only works if the error is in the range [0.0, 1.0], but the errors passed to this method are percentages in the range [0, 100], which in turn come from the get_table_index method. Dividing the error by 100 solved the issue for me.
def compute_accuracy(error_weights):
    """Calculates a score based on weights assigned to various
    parameters and their error percentages.

    Parameters
    ----------
    error_weights : list
        Two-dimensional list of the form [[p1, e1], [p2, e2], ...]
        where pn is the weight assigned to list of errors en.
        Sum of pn should be equal to 100.

    Returns
    -------
    score : float

    """
    SCORE_VAL = 100
    try:
        score = 0
        if sum([ew[0] for ew in error_weights]) != SCORE_VAL:
            raise ValueError("Sum of weights should be equal to 100.")
        for ew in error_weights:
            weight = ew[0] / len(ew[1])
            for error_percentage in ew[1]:
                # The line in question: error_percentage arrives on a 0-100 scale.
                score += weight * (1 - error_percentage)
    except ZeroDivisionError:
        score = 0
    return score
The change is from score += weight * (1 - error_percentage) to score += weight * (1 - error_percentage / 100.0)
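To make the scale mismatch concrete, here is a minimal standalone sketch of the scoring loop that can be run without Camelot or a PDF. The `percent_scale` flag and the sample numbers are assumptions made up for illustration; they are not part of Camelot, and the real error values come from the parser.

```python
def compute_accuracy(error_weights, percent_scale=True):
    """Score from [[weight, [errors...]], ...], mirroring the function above.

    percent_scale is a hypothetical flag added for this sketch only:
    True  -> errors are treated as percentages in [0, 100] and divided by 100
    False -> errors are used as-is (the behaviour of the line in question)
    """
    SCORE_VAL = 100
    try:
        score = 0
        if sum(ew[0] for ew in error_weights) != SCORE_VAL:
            raise ValueError("Sum of weights should be equal to 100.")
        for weight_total, errors in error_weights:
            # Spread this parameter's weight evenly over its errors.
            weight = weight_total / len(errors)
            for error_percentage in errors:
                if percent_scale:
                    error_percentage = error_percentage / 100.0
                score += weight * (1 - error_percentage)
    except ZeroDivisionError:
        # An empty error list gives len(errors) == 0.
        score = 0
    return score


# Made-up inputs: two parameters weighted 50/50, errors given as percentages.
sample = [[50, [2.0, 4.0]], [50, [0.0]]]

unscaled = compute_accuracy(sample, percent_scale=False)
scaled = compute_accuracy(sample, percent_scale=True)

print("without dividing by 100:", unscaled)  # negative
print("with dividing by 100:   ", scaled)    # stays within 0..100
```

With errors of only 2% and 4%, the unscaled version already goes negative, because each term becomes weight * (1 - 2.0) and weight * (1 - 4.0). Dividing by 100 first keeps every term between 0 and its weight, so the total stays within 0 to 100.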
While testing, I ran into a case where table.accuracy is a negative number. PDF: page-3.pdf