Open chinyeungli opened 3 months ago
@chinyeungli The Truncated description is the expected behavior with XLSX outputs.
From https://github.com/nexB/scancode.io/blob/main/scanpipe/pipes/output.py#L384
- Truncate the "description" field to the first five lines.
if fieldname == "description":
max_description_lines = 5
value = "\n".join(value.splitlines(False)[:max_description_lines])
@pombredanne Could you provide some insight on those design decisions?
@chinyeungli As you can see, the XLSX format has some limitations and is not ideal for sharing "scan data". If your main concern is data integrity, the JSON format is preferred.
Convert the value to a string and perform these adaptations:
- Keep only unique values in lists, preserving ordering.
- Truncate the "description" field to the first five lines.
- Truncate any field too long to fit in an XLSX cell and report error.
- Create a combined license expression for expressions
- Normalize line endings
- Truncate the value to a
maximum_length
supported by XLSX.
Describe the bug The description field from the JSON output is as follow:
However, in the XLSX output, it got trancated in some of the newline character: