digital-preservation / csv-validator

CSV Validation Tool and API (CSV Schema RI)
Mozilla Public License 2.0

How to modify the text parser settings (especially the maximum number of characters per column)? #504

Open teki69 opened 2 months ago

teki69 commented 2 months ago

Hello. When I try to parse a CSV file that has a very large number of characters in one column, the validator crashes with the following error:

Error: com.univocity.parsers.common.TextParsingException: Length of parsed input (4097) exceeds the maximum number of characters defined in your parser settings (4096).
Parser Configuration: CsvParserSettings:
    Auto configuration enabled=true
    Auto-closing enabled=true
    Autodetect column delimiter=false
    Autodetect quotes=false
    Column reordering enabled=true
    Delimiters for detection=null
    Empty value=null
    Escape unquoted values=false
    Header extraction enabled=null
    Headers=null
    Ignore leading whitespaces=false
    Ignore leading whitespaces in quotes=false
    Ignore trailing whitespaces=false
    Ignore trailing whitespaces in quotes=false
    Input buffer size=1048576
    Input reading on separate thread=true
    Keep escape sequences=false
    Keep quotes=false
    Length of content displayed on error=-1
    Line separator detection enabled=true
    Maximum number of characters per column=4096
    Maximum number of columns=512
    Normalize escaped line separators=true
    Null value=null
    Number of records to read=all
    Processor=none
    Restricting data in exceptions=false
    RowProcessor error handler=null
    Selected fields=none
    Skip bits as whitespace=true
    Skip empty lines=true
    Unescaped quote handling=null
Format configuration:
    CsvFormat:
        Comment character=#
        Field delimiter=;
        Line separator (normalized)=\n
        Line separator sequence=\r\n
        Quote character="
        Quote escape character="
        Quote escape escape character=null

Is there a way to modify these text parser settings in general, and (for my problem) this setting in particular?

Maximum number of characters per column=4096
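
For context: in univocity-parsers itself this limit is set with `CsvParserSettings.setMaxCharsPerColumn`. Whether csv-validator exposes it is exactly what I am asking. To work out how large the limit would need to be for a given file, here is a minimal sketch that measures the longest field using only the Java standard library. It does not call univocity or csv-validator. The class and method names (`LongestField`, `longestField`) are hypothetical, and it assumes the `;` delimiter and doubled-quote escaping shown in the format configuration above.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class LongestField {

    /**
     * Returns the length, in characters, of the longest field read from `in`.
     * Assumes quotes inside a quoted field are escaped by doubling them.
     */
    static int longestField(Reader in, char delimiter, char quote) throws IOException {
        int longest = 0;
        int current = 0;
        boolean inQuotes = false;
        int c;
        int next = in.read();               // one character of lookahead
        while ((c = next) != -1) {
            next = in.read();
            if (inQuotes) {
                if (c == quote && next == quote) {
                    current++;              // doubled quote counts as one character
                    next = in.read();       // skip the second quote
                } else if (c == quote) {
                    inQuotes = false;       // closing quote
                } else {
                    current++;              // delimiters and newlines are data here
                }
            } else if (c == quote) {
                inQuotes = true;            // opening quote
            } else if (c == delimiter || c == '\n' || c == '\r') {
                longest = Math.max(longest, current);
                current = 0;                // field ended
            } else {
                current++;
            }
        }
        return Math.max(longest, current);  // last field may lack a trailing newline
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample: one quoted field of 5000 characters (needs Java 11+ for repeat).
        String sample = "id;text\r\n1;\"" + "x".repeat(5000) + "\"\r\n";
        System.out.println(longestField(new StringReader(sample), ';', '"'));
    }
}
```

For a real file, pass a `java.io.BufferedReader` over the file instead of the `StringReader`. Any result above 4096 means the file would trigger the exception above under the default setting.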