Layout-Parser / layout-parser

A Unified Toolkit for Deep Learning Based Document Image Analysis
https://layout-parser.github.io/
Apache License 2.0

fix to issue #94 #95

Closed kforcodeai closed 2 years ago

kforcodeai commented 2 years ago

Fixes # https://github.com/Layout-Parser/layout-parser/issues/94#issue-1040387217

#94

The issue was that all digit sequences were inferred as float. With this fix, all text (numeric and non-numeric) is inferred as string, and the user can convert it to the desired data type. The downside is that the user now has to convert numeric columns explicitly. I could not find a better solution than this.
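As a rough illustration of the manual conversion step mentioned above, here is a minimal pandas sketch. The DataFrame and its `text` column are hypothetical stand-ins, not layout-parser's actual data structure; the only real API used is `pd.to_numeric`.

```python
import pandas as pd

# Hypothetical data: every value was loaded as a string, as described above.
df = pd.DataFrame({"text": ["123", "4.5", "abc"]})

# Opt in to numeric conversion only where it is wanted.
# errors="coerce" turns values that are not numbers into NaN instead of raising.
numeric_text = pd.to_numeric(df["text"], errors="coerce")

print(numeric_text.tolist())
```

The original string column is left untouched, so non-numeric text such as "abc" is still available in `df["text"]`.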

lolipopshock commented 2 years ago

I think the new solution can solve your issue -- see example below:

Let's say we have a CSV file, test.csv:

Col_A, Col_B
, 1
2, 3
245.0, 

And if we read it via:

df = pd.read_csv("test.csv", converters={"Col_A": str})

We get:

Col_A  Col_B
       1
2      3
245.0

(Note that there is no .0 appended to the 2 in the 2nd row, 1st column.)
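For anyone who wants to try this without creating a file, here is a small self-contained sketch of the same comparison. It assumes the CSV content is supplied through `io.StringIO` and written without the spaces after the commas, so it is not byte-for-byte the test.csv from the comment.

```python
import io

import pandas as pd

# In-memory stand-in for test.csv (assumption: no spaces after the commas).
CSV_TEXT = "Col_A,Col_B\n,1\n2,3\n245.0,\n"

# Default inference: Col_A contains a missing value and 245.0, so it becomes float.
default_df = pd.read_csv(io.StringIO(CSV_TEXT))

# With a str converter, each Col_A field is kept exactly as written in the file.
str_df = pd.read_csv(io.StringIO(CSV_TEXT), converters={"Col_A": str})

print(default_df["Col_A"].dtype)
print(str_df["Col_A"].tolist())
```

With the converter, the empty field stays an empty string rather than becoming NaN, which is why 2 is not promoted to 2.0.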