VikParuchuri / marker

Convert PDF to markdown quickly with high accuracy
https://www.datalab.to
GNU General Public License v3.0
14.65k stars, 763 forks

ValueError: could not convert string to float: 'True' in convert_single.py #80

Closed mrticker closed 2 months ago

mrticker commented 5 months ago

The following error occurs while converting https://arxiv.org/abs/2001.07685:

Loaded texify model to cuda with torch.float16 dtype
    main()
  File "/marker/convert_single.py", line 21, in main
    full_text, out_meta = convert_single_pdf(fname, model_lst, max_pages=args.max_pages, parallel_factor=args.parallel_factor)
  File "/marker/marker/convert.py", line 137, in convert_single_pdf
    table_count = create_new_tables(blocks)
  File "/marker/marker/cleaners/table.py", line 82, in create_new_tables
    new_text = tabulate(table_rows, headers="firstrow", tablefmt="github")
  File "/opt/conda/lib/python3.9/site-packages/tabulate/__init__.py", line 2153, in tabulate
    cols = [
  File "/opt/conda/lib/python3.9/site-packages/tabulate/__init__.py", line 2154, in <listcomp>
    [_format(v, ct, fl_fmt, int_fmt, miss_v, has_invisible) for v in c]
  File "/opt/conda/lib/python3.9/site-packages/tabulate/__init__.py", line 2154, in <listcomp>
    [_format(v, ct, fl_fmt, int_fmt, miss_v, has_invisible) for v in c]
  File "/opt/conda/lib/python3.9/site-packages/tabulate/__init__.py", line 1232, in _format
    return format(float(val), floatfmt)
ValueError: could not convert string to float: 'True'

nunamia commented 5 months ago

Failed to create table: could not convert string to float: 'True'
Table rows: [['CIFAR-10', 'CIFAR-100', 'SVHN', 'STL-10'], ['τ'], ['0.95'], ['λ'], ['u'], ['1'], ['µ'], ['7'], ['B'], ['64'], ['lr'], ['0.03'], ['β'], ['0.9'], ['Nesterov', 'True'], ['weight decay', '0.0005', '0.001', '0.0005', '0.0005']]

nunamia commented 5 months ago

I have the same problem. It looks related to this python-tabulate issue: https://github.com/astanin/python-tabulate/issues/297

VikParuchuri commented 2 months ago

I will fix this by setting disable_numparse=True in the tabulate call.