-
When extracting a GeoPackage table, PDI returns this error on a date field:
Error reading features :Unparseable date: "Thu Aug 31 02:00:00 CEST 2023"
2024/05/02 11:41:14 - Extract…
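
The rejected string has the same layout as the output of Java's `Date.toString()` (`EEE MMM dd HH:mm:ss zzz yyyy`). As an illustration only, and not a description of what PDI does internally, here is a minimal Python sketch that parses a value in that layout. The function name `parse_java_date_string` and the `TZ_OFFSETS` table are hypothetical, and the abbreviation-to-offset mapping is an assumption, since zone abbreviations such as `CEST` are not unique in general.

```python
from datetime import datetime, timedelta, timezone

# Assumed mapping from zone abbreviation to a fixed UTC offset.
# Abbreviations are ambiguous in general; extend this table as needed.
TZ_OFFSETS = {
    "CEST": timezone(timedelta(hours=2), "CEST"),
    "CET": timezone(timedelta(hours=1), "CET"),
    "UTC": timezone.utc,
}


def parse_java_date_string(text):
    """Parse a string laid out like 'Thu Aug 31 02:00:00 CEST 2023'."""
    parts = text.split()
    if len(parts) != 6:
        raise ValueError(f"unexpected date layout: {text!r}")
    tz_name = parts[4]
    if tz_name not in TZ_OFFSETS:
        raise ValueError(f"unknown time zone abbreviation: {tz_name!r}")
    # Drop the zone token and parse the remaining fields with a fixed pattern.
    # %a and %b expect English names under the default C locale.
    without_zone = " ".join(parts[:4] + parts[5:])
    naive = datetime.strptime(without_zone, "%a %b %d %H:%M:%S %Y")
    # Attach the fixed offset looked up from the abbreviation.
    return naive.replace(tzinfo=TZ_OFFSETS[tz_name])


parsed = parse_java_date_string("Thu Aug 31 02:00:00 CEST 2023")
print(parsed.isoformat())
```

Handling the zone token separately avoids relying on `%Z`, which in Python's `strptime` only recognizes a small set of names that depends on the local machine.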
-
Hi,
I was setting up a test site and playing with trafilatura when I found a weird bug.
site URL:
`https://milkfriends.s1-tastewp.com/2024/06/27/ok-this/`
as this test site is only available for 2 d…
-
A "how-to" page that describes what information is available on the tracker and how to access different types of data:
- costs
- levels of information extraction
- plans
- description on how …
-
**This issue is exclusively to track issues with SOPN Table Extraction.**
For SOPN Parsing: Table Parsing Errors, go here: https://github.com/DemocracyClub/yournextrepresentative/issues/1728
For SO…
-
## Problem Description
When using the sample provided by the llmware project, I've encountered issues with the accuracy of table extractions. Specifically, not all tables are being extracted correctl…
-
The engine used for extraction of tables from PDF files is a well-known Python library called camelot. However, this library requires that the processed PDF file contains text ("computer" text, not ju…
-
```
{{ user.name }}
{{ user.age }}
{{ user.job }}
```
This only extracts "Name" and "Job" (every other entry). If I wrap the string in `{{ }}` with the `translate` filter, then it works, but this br…
-
Seems to have stopped working from the 2020-03-27 10am release onward.
-
https://github.com/camelot-dev/camelot/wiki/Comparison-with-other-PDF-Table-Extraction-libraries-and-tools
-
Attached PDF file is not processed correctly:
- in Stream mode, Tabula does not recognize the last column (col 4)
- cell data from col 4 is extracted but merged into text from col 3
- error occurs …