aboutcode-org / scancode.io

ScanCode.io is a server to script and automate software composition analysis with ScanPipe pipelines. This project is sponsored by the NLnet project https://nlnet.nl/project/vulnerabilitydatabase/, Google Summer of Code, nexB and other generous sponsors!
https://scancodeio.readthedocs.io
Apache License 2.0
109 stars 85 forks

XLSX output is truncated #1315

Open pombredanne opened 3 months ago

pombredanne commented 3 months ago

If I have a project with more than 1,048,575 (1,048,576 - 1) resources, loading its XLSX output in LibreOffice leads to silent truncation, because the worksheet exceeds LibreOffice's maximum of 1,048,576 rows.

It would be great to either:

  1. include an error message in the spreadsheet
  2. split any long worksheets: for instance, RESOURCE with 1,048,575 resources, then RESOURCE1 with another 1,048,575, and so on.
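
As a rough illustration of option 2, here is a minimal sketch of splitting one long list of rows into several worksheets. It uses only the standard library and is not the ScanCode.io implementation; the function name `split_into_worksheets`, the `max_rows` parameter and the assumption that one row per sheet is reserved for a header are all hypothetical. The sheet names follow the RESOURCE, RESOURCE1, ... scheme suggested above.

```python
from itertools import islice

# Row limit per worksheet cited in this issue.
XLSX_MAX_ROWS = 1_048_576


def split_into_worksheets(rows, base_name="RESOURCE", max_rows=XLSX_MAX_ROWS):
    """
    Yield (sheet_name, chunk) pairs, each chunk holding at most
    ``max_rows - 1`` data rows (assumption: one row is kept for a header).
    Sheets are named RESOURCE, RESOURCE1, RESOURCE2, and so on.
    """
    per_sheet = max_rows - 1
    if per_sheet < 1:
        raise ValueError("max_rows must leave room for at least one data row")

    iterator = iter(rows)
    index = 0
    while True:
        chunk = list(islice(iterator, per_sheet))
        # Always emit the first sheet, even when there are no rows at all.
        if not chunk and index > 0:
            break
        name = base_name if index == 0 else f"{base_name}{index}"
        yield name, chunk
        # A short chunk means the input is exhausted.
        if len(chunk) < per_sheet:
            break
        index += 1
```

Each yielded pair could then be written to its own tab by whichever XLSX writer is in use. Because the function consumes an iterator chunk by chunk, it never needs more than one sheet's worth of rows in memory at a time.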

In practice, codebases with a million or more files do (unfortunately) exist.

mjherzog commented 3 months ago

Excel is not silent on this problem: it will report a warning. Perhaps LibreOffice has a setting for this. If we are going to deal with this in SCIO, then we need some way to alert the user about the problem when creating the XLSX output, and ideally define logical chunks from SCIO rather than arbitrary chunks based on size.
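
One way to alert the user at export time could look like the sketch below. It is only an illustration under stated assumptions: the helper name `truncation_warning`, the message wording and the one-header-row assumption are hypothetical, not part of ScanCode.io. The returned message could be logged or written into the spreadsheet itself.

```python
import math

# Row limit per worksheet cited in this issue.
XLSX_MAX_ROWS = 1_048_576


def truncation_warning(sheet_name, row_count, max_rows=XLSX_MAX_ROWS):
    """
    Return a warning message when ``row_count`` data rows do not fit in
    one worksheet, or None when they fit.
    Assumption: one row per worksheet is reserved for a header.
    """
    capacity = max_rows - 1
    if row_count <= capacity:
        return None

    sheets_needed = math.ceil(row_count / capacity)
    return (
        f"Worksheet {sheet_name!r}: {row_count:,} rows exceed the limit of "
        f"{capacity:,} data rows per worksheet; "
        f"{sheets_needed} worksheets are needed to hold all rows."
    )
```

This only detects the size problem. Choosing logical chunks rather than size-based ones would need project-specific knowledge that this sketch does not attempt to model.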

pombredanne commented 3 months ago

@mjherzog I have pushed a fix in a branch, and we now create multiple tabs if needed.