aboutcode-org / scancode.io

ScanCode.io is a server to script and automate software composition analysis pipelines with ScanPipe pipelines. This project is sponsored by the NLnet project https://nlnet.nl/project/vulnerabilitydatabase/, Google Summer of Code, nexB, and other generous sponsors!
https://scancodeio.readthedocs.io
Apache License 2.0

Can't download XLSX File #824

Closed parvjain639 closed 1 year ago

parvjain639 commented 1 year ago

For many projects, I am unable to download the XLSX format report. The download fails with a SERVER ERROR 500.

Screenshot 2023-07-27 111707

tdruez commented 1 year ago

Hi @parvjain639, could you provide some context in order to reproduce this issue?

parvjain639 commented 1 year ago
  1. V32.4.0
  2. Yes
  3. Debian 11.6
  4. link 1. https://hub.docker.com/_/python link 2. https://hub.docker.com/r/ioexpert/netpi-openplc
  5. Docker, Find_Vulnerabilities And Scan_Codebase
tdruez commented 1 year ago

@parvjain639 can you try the other download formats and let me know if you also have a 500 for those?

parvjain639 commented 1 year ago

Yes, I have tried downloading the output in the other formats, and those download without any problem. The error only shows up when I try to download the XLSX format!

tdruez commented 1 year ago

Ok, thanks for the confirmation. I cannot reproduce so far by running docker + find_vulnerabilities pipelines on the docker://python input.

Could you look into the web container log?

docker compose logs --tail="200" web

Is there anything related to the issue?

parvjain639 commented 1 year ago

Command: docker compose logs --tail="200" web

Result:

web_1 | ERROR Internal Server Error: /project/pythonlatest-c46201cd/results/xlsx/
web_1 | Traceback (most recent call last):
web_1 |   File "/usr/local/lib/python3.11/site-packages/django/core/handlers/exception.py", line 55, in inner
web_1 |     response = get_response(request)
web_1 |                ^^^^^^^^^^^^^^^^^^^^^
web_1 |   File "/usr/local/lib/python3.11/site-packages/django/core/handlers/base.py", line 197, in _get_response
web_1 |     response = wrapped_callback(request, *callback_args, **callback_kwargs)
web_1 |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web_1 |   File "/usr/local/lib/python3.11/contextlib.py", line 81, in inner
web_1 |     return func(*args, **kwds)
web_1 |            ^^^^^^^^^^^^^^^^^^^
web_1 |   File "/usr/local/lib/python3.11/site-packages/django/views/generic/base.py", line 104, in view
web_1 |     return self.dispatch(request, *args, **kwargs)
web_1 |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web_1 |   File "/usr/local/lib/python3.11/site-packages/django/contrib/auth/mixins.py", line 135, in dispatch
web_1 |     return super().dispatch(request, *args, **kwargs)
web_1 |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web_1 |   File "/usr/local/lib/python3.11/site-packages/django/views/generic/base.py", line 143, in dispatch
web_1 |     return handler(request, *args, **kwargs)
web_1 |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web_1 |   File "/app/scanpipe/views.py", line 988, in get
web_1 |     output_file = output.to_xlsx(project)
web_1 |                   ^^^^^^^^^^^^^^^^^^^^^^^
web_1 |   File "/app/scanpipe/pipes/output.py", line 471, in to_xlsx
web_1 |     if layers_data := docker.get_layers_data(project):
web_1 |                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web_1 |   File "/app/scanpipe/pipes/docker.py", line 291, in get_layers_data
web_1 |     image_id = image.get("image_id")
web_1 |                ^^^^^^^^^
web_1 | AttributeError: 'str' object has no attribute 'get'
web_1 | ERROR Internal Server Error: /project/redis-ff00d238/results/xlsx/
web_1 | Traceback (most recent call last):
web_1 |   File "/usr/local/lib/python3.11/site-packages/django/core/handlers/exception.py", line 55, in inner
web_1 |     response = get_response(request)
web_1 |                ^^^^^^^^^^^^^^^^^^^^^
web_1 |   File "/usr/local/lib/python3.11/site-packages/django/core/handlers/base.py", line 197, in _get_response
web_1 |     response = wrapped_callback(request, *callback_args, **callback_kwargs)
web_1 |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web_1 |   File "/usr/local/lib/python3.11/contextlib.py", line 81, in inner
web_1 |     return func(*args, **kwds)
web_1 |            ^^^^^^^^^^^^^^^^^^^
web_1 |   File "/usr/local/lib/python3.11/site-packages/django/views/generic/base.py", line 104, in view
web_1 |     return self.dispatch(request, *args, **kwargs)
web_1 |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web_1 |   File "/usr/local/lib/python3.11/site-packages/django/contrib/auth/mixins.py", line 135, in dispatch
web_1 |     return super().dispatch(request, *args, **kwargs)
web_1 |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web_1 |   File "/usr/local/lib/python3.11/site-packages/django/views/generic/base.py", line 143, in dispatch
web_1 |     return handler(request, *args, **kwargs)
web_1 |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web_1 |   File "/app/scanpipe/views.py", line 988, in get
web_1 |     output_file = output.to_xlsx(project)
web_1 |                   ^^^^^^^^^^^^^^^^^^^^^^^
web_1 |   File "/app/scanpipe/pipes/output.py", line 471, in to_xlsx
web_1 |     if layers_data := docker.get_layers_data(project):
web_1 |                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web_1 |   File "/app/scanpipe/pipes/docker.py", line 291, in get_layers_data
web_1 |     image_id = image.get("image_id")
web_1 |                ^^^^^^^^^
web_1 | AttributeError: 'str' object has no attribute 'get'

Here is the output received from the command.

tdruez commented 1 year ago

Thanks, that's helpful!

link 1. https://hub.docker.com/_/python

Can you clarify the exact input you provided to the Project? Was it a docker type URL or an upload of an image downloaded from docker.com?

parvjain639 commented 1 year ago

We used a Docker type URL, i.e. docker://python:latest

tdruez commented 1 year ago

@parvjain639 Ok, did you make any particular configuration changes or are you running the default?

web_1 | AttributeError: 'str' object has no attribute 'get'

For some reason the content of project.extra_data appears to be a string instead of the expected JSON/dict. Let's try to look at the data:

Could you access a Django shell using:

docker compose run web ./manage.py shell

Once in the shell:

from scanpipe.models import Project
project = Project.objects.get(slug="redis-ff00d238")
print(project.extra_data)
print(type(project.extra_data))
print(type(project.extra_data.get("images")))

Please provide the output of those commands; it should give us a better idea of the shape of the data.
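To make the error message concrete, here is a minimal Python sketch of how `'str' object has no attribute 'get'` can arise when code iterates over an "images" value that is not a list of dicts. The data shapes and the helper names `naive_image_ids` and `collect_image_ids` are hypothetical and are not taken from the ScanCode.io code base or from a real project; this only illustrates the failure pattern and one defensive way to read such data.

```python
# Hypothetical data shapes, for illustration only.
expected = {"images": [{"image_id": "abc123", "layers": []}]}
unexpected = {"images": {"name": "rootfs", "distro": {}}}  # a dict, not a list of dicts


def naive_image_ids(extra_data):
    """Mimic the failing pattern: assume every entry under "images" is a dict."""
    return [image.get("image_id") for image in extra_data.get("images", [])]


def collect_image_ids(extra_data):
    """Return the image ids found under "images", skipping malformed entries."""
    images = extra_data.get("images", [])
    if isinstance(images, dict):
        # Iterating over a dict yields its keys (strings), so there is
        # nothing usable to read in this shape.
        return []
    return [image.get("image_id") for image in images if isinstance(image, dict)]


print(collect_image_ids(expected))
print(collect_image_ids(unexpected))

try:
    naive_image_ids(unexpected)
except AttributeError as error:
    # Same message as in the web container log.
    print(error)
```

The point of the sketch is that the value does not have to be a plain string for the error to appear: iterating over a dict hands back its string keys, and calling `.get` on one of those fails in the same way.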

parvjain639 commented 1 year ago

We are using the default configuration files; no changes were made.

Screenshot 2023-07-28 102419

Could you access a Django shell using:

I am not getting clear results from this!

tdruez commented 1 year ago

Thanks, it seems that one of the pipeline steps may have an impact on the extra_data structure.

Could you paste the output of:

docker compose run web ./manage.py status --project PROJECT_NAME

In your case I'm guessing the PROJECT_NAME is redis.

parvjain639 commented 1 year ago

Here is the Output of the Command!

root@debian:~/projects/scancode.io# docker compose run web ./manage.py status --project Python.Latest
[+] Building 0.0s (0/0)
[+] Creating 1/0
 ✔ Container scancodeio-db-1 Running 0.0s
[+] Building 0.0s (0/0)
Project: Python.Latest
Create date: Jul 19 2023 04:51
Work directory: /var/scancodeio/workspace/projects/pythonlatest-c46201cd

Database:

Inputs:

Pipelines:
[FAILURE] deploy_to_develop
    2023-07-19 04:51:56.84 Pipeline [deploy_to_develop] starting
    2023-07-19 04:51:57.19 Step [get_inputs] starting
    2023-07-19 04:51:57.20 Pipeline failed
[SUCCESS] docker (executed in 4326 seconds)
    2023-07-19 04:52:12.03 Pipeline [docker] starting
    2023-07-19 04:52:12.29 Step [extract_images] starting
    2023-07-19 04:52:24.87 Step [extract_images] completed in 13 seconds
    2023-07-19 04:52:26.75 Step [extract_layers] starting
    2023-07-19 04:52:47.08 Step [extract_layers] completed in 19 seconds
    2023-07-19 04:52:47.10 Step [find_images_os_and_distro] starting
    2023-07-19 04:52:47.10 Step [find_images_os_and_distro] completed in 0 seconds
    2023-07-19 04:52:47.11 Step [collect_images_information] starting
    2023-07-19 04:52:47.13 Step [collect_images_information] completed in 0 seconds
    2023-07-19 04:52:47.13 Step [collect_and_create_codebase_resources] starting
    2023-07-19 05:01:27.64 Step [collect_and_create_codebase_resources] completed in 520 seconds (8.7 minutes)
    2023-07-19 05:01:27.65 Step [collect_and_create_system_packages] starting
    2023-07-19 05:47:41.41 Step [collect_and_create_system_packages] completed in 2774 seconds (46.2 minutes)
    2023-07-19 05:47:41.43 Step [flag_uninteresting_codebase_resources] starting
    2023-07-19 05:47:41.85 Step [flag_uninteresting_codebase_resources] completed in 0 seconds
    2023-07-19 05:47:41.86 Step [flag_empty_files] starting
    2023-07-19 05:47:41.91 Step [flag_empty_files] completed in 0 seconds
    2023-07-19 05:47:41.93 Step [flag_ignored_resources] starting
    2023-07-19 05:47:41.93 Step [flag_ignored_resources] completed in 0 seconds
    2023-07-19 05:47:41.94 Step [scan_for_application_packages] starting
    2023-07-19 05:50:59.59 Step [scan_for_application_packages] completed in 198 seconds (3.3 minutes)
    2023-07-19 05:50:59.60 Step [scan_for_files] starting
    2023-07-19 06:04:16.61 Step [scan_for_files] completed in 797 seconds (13.3 minutes)
    2023-07-19 06:04:16.62 Step [analyze_scanned_files] starting
    2023-07-19 06:04:18.02 Step [analyze_scanned_files] completed in 1 seconds
    2023-07-19 06:04:18.03 Step [flag_not_analyzed_codebase_resources] starting
    2023-07-19 06:04:18.04 Step [flag_not_analyzed_codebase_resources] completed in 0 seconds
    2023-07-19 06:04:18.05 Pipeline completed
[SUCCESS] find_vulnerabilities (executed in 795 seconds)
    2023-07-19 06:06:25.27 Pipeline [find_vulnerabilities] starting
    2023-07-19 06:06:25.29 Step [check_vulnerablecode_service_availability] starting
    2023-07-19 06:06:34.62 Step [check_vulnerablecode_service_availability] completed in 9 seconds
    2023-07-19 06:06:34.64 Step [lookup_vulnerabilities] starting
    2023-07-19 06:19:40.82 Step [lookup_vulnerabilities] completed in 786 seconds (13.1 minutes)
    2023-07-19 06:19:40.84 Pipeline completed
[FAILURE] inspect_manifest
    2023-07-19 06:23:04.19 Pipeline [inspect_manifest] starting
    2023-07-19 06:23:04.20 Step [get_manifest_inputs] starting
    2023-07-19 06:23:04.21 Step [get_manifest_inputs] completed in 0 seconds
    2023-07-19 06:23:04.22 Step [get_packages_from_manifest] starting
    2023-07-19 06:23:04.26 Pipeline failed
[SUCCESS] load_inventory
    2023-07-19 06:23:35.16 Pipeline [load_inventory] starting
    2023-07-19 06:23:35.18 Step [get_inputs] starting
    2023-07-19 06:23:35.19 Step [get_inputs] completed in 0 seconds
    2023-07-19 06:23:35.20 Step [build_inventory_from_scans] starting
    2023-07-19 06:23:35.21 Step [build_inventory_from_scans] completed in 0 seconds
    2023-07-19 06:23:35.23 Pipeline completed
[SUCCESS] root_filesystems (executed in 1046 seconds)
    2023-07-19 06:24:12.61 Pipeline [root_filesystems] starting
    2023-07-19 06:24:12.62 Step [extract_input_files_to_codebase_directory] starting
    2023-07-19 06:24:30.80 Step [extract_input_files_to_codebase_directory] completed in 18 seconds
    2023-07-19 06:24:30.84 Step [find_root_filesystems] starting
    2023-07-19 06:24:30.85 Step [find_root_filesystems] completed in 0 seconds
    2023-07-19 06:24:30.86 Step [collect_rootfs_information] starting
    2023-07-19 06:24:30.86 Step [collect_rootfs_information] completed in 0 seconds
    2023-07-19 06:24:30.87 Step [collect_and_create_codebase_resources] starting
    2023-07-19 06:30:22.24 Step [collect_and_create_codebase_resources] completed in 351 seconds (5.9 minutes)
    2023-07-19 06:30:22.25 Step [collect_and_create_system_packages] starting
    2023-07-19 06:30:22.28 Step [collect_and_create_system_packages] completed in 0 seconds
    2023-07-19 06:30:22.31 Step [flag_uninteresting_codebase_resources] starting
    2023-07-19 06:30:22.33 Step [flag_uninteresting_codebase_resources] completed in 0 seconds
    2023-07-19 06:30:22.34 Step [flag_empty_files] starting
    2023-07-19 06:30:22.35 Step [flag_empty_files] completed in 0 seconds
    2023-07-19 06:30:22.36 Step [flag_ignored_resources] starting
    2023-07-19 06:30:22.36 Step [flag_ignored_resources] completed in 0 seconds
    2023-07-19 06:30:22.37 Step [scan_for_application_packages] starting
    2023-07-19 06:33:26.71 Step [scan_for_application_packages] completed in 184 seconds (3.1 minutes)
    2023-07-19 06:33:26.73 Step [match_not_analyzed_to_system_packages] starting
    2023-07-19 06:37:24.29 Step [match_not_analyzed_to_system_packages] completed in 238 seconds (4.0 minutes)
    2023-07-19 06:37:24.31 Step [scan_for_files] starting
    2023-07-19 06:41:38.72 Step [scan_for_files] completed in 254 seconds (4.2 minutes)
    2023-07-19 06:41:38.73 Step [analyze_scanned_files] starting
    2023-07-19 06:41:38.86 Step [analyze_scanned_files] completed in 0 seconds
    2023-07-19 06:41:38.88 Step [flag_not_analyzed_codebase_resources] starting
    2023-07-19 06:41:38.90 Step [flag_not_analyzed_codebase_resources] completed in 0 seconds
    2023-07-19 06:41:38.91 Pipeline completed
[SUCCESS] scan_codebase (executed in 13818 seconds)
    2023-07-19 06:43:47.93 Pipeline [scan_codebase] starting
    2023-07-19 06:43:48.37 Step [copy_inputs_to_codebase_directory] starting
    2023-07-19 06:43:51.75 Step [copy_inputs_to_codebase_directory] completed in 3 seconds
    2023-07-19 06:43:51.76 Step [extract_archives] starting
    2023-07-19 07:05:21.04 Step [extract_archives] completed in 1289 seconds (21.5 minutes)
    2023-07-19 07:05:21.07 Step [collect_and_create_codebase_resources] starting
    2023-07-19 07:21:51.36 Step [collect_and_create_codebase_resources] completed in 990 seconds (16.5 minutes)
    2023-07-19 07:21:51.38 Step [flag_empty_files] starting
    2023-07-19 07:21:52.18 Step [flag_empty_files] completed in 1 seconds
    2023-07-19 07:21:52.22 Step [flag_ignored_resources] starting
    2023-07-19 07:21:52.22 Step [flag_ignored_resources] completed in 0 seconds
    2023-07-19 07:21:52.23 Step [scan_for_application_packages] starting
    2023-07-19 07:48:29.04 Step [scan_for_application_packages] completed in 1597 seconds (26.6 minutes)
    2023-07-19 07:48:29.06 Step [scan_for_files] starting
    2023-07-19 10:34:06.14 Step [scan_for_files] completed in 9937 seconds (2.8 hours)
    2023-07-19 10:34:06.16 Pipeline completed
[SUCCESS] scan_package (executed in 18813 seconds)
    2023-07-19 10:34:53.30 Pipeline [scan_package] starting
    2023-07-19 10:34:53.72 Step [get_package_archive_input] starting
    2023-07-19 10:34:53.73 Step [get_package_archive_input] completed in 0 seconds
    2023-07-19 10:34:53.74 Step [collect_archive_information] starting
    2023-07-19 10:35:17.63 Step [collect_archive_information] completed in 24 seconds
    2023-07-19 10:35:17.66 Step [extract_archive_to_codebase_directory] starting
    2023-07-19 10:35:34.32 Step [extract_archive_to_codebase_directory] completed in 17 seconds
    2023-07-19 10:35:34.33 Step [run_scancode] starting
    2023-07-19 15:42:22.59 Step [run_scancode] completed in 18408 seconds (5.1 hours)
    2023-07-19 15:42:22.60 Step [load_inventory_from_toolkit_scan] starting
    2023-07-19 15:48:13.89 Step [load_inventory_from_toolkit_scan] completed in 351 seconds (5.9 minutes)
    2023-07-19 15:48:13.91 Step [make_summary_from_scan_results] starting
    2023-07-19 15:48:26.56 Step [make_summary_from_scan_results] completed in 13 seconds
    2023-07-19 15:48:26.57 Pipeline completed
[SUCCESS] find_vulnerabilities (executed in 213 seconds)
    2023-07-19 15:48:27.21 Pipeline [find_vulnerabilities] starting
    2023-07-19 15:48:30.41 Step [check_vulnerablecode_service_availability] starting
    2023-07-19 15:48:32.55 Step [check_vulnerablecode_service_availability] completed in 2 seconds
    2023-07-19 15:48:32.57 Step [lookup_vulnerabilities] starting
    2023-07-19 15:52:00.66 Step [lookup_vulnerabilities] completed in 208 seconds (3.5 minutes)
    2023-07-19 15:52:00.68 Pipeline completed
[FAILURE] inspect_manifest
    2023-07-20 04:07:13.34 Pipeline [inspect_manifest] starting
    2023-07-20 04:07:13.72 Step [get_manifest_inputs] starting
    2023-07-20 04:07:13.74 Step [get_manifest_inputs] completed in 0 seconds
    2023-07-20 04:07:13.75 Step [get_packages_from_manifest] starting
    2023-07-20 04:07:13.79 Pipeline failed
[FAILURE] deploy_to_develop
    2023-07-20 07:31:29.14 Pipeline [deploy_to_develop] starting
    2023-07-20 07:31:29.59 Step [get_inputs] starting
    2023-07-20 07:31:29.60 Pipeline failed
[SUCCESS] find_vulnerabilities (executed in 165 seconds)
    2023-07-25 05:49:52.73 Pipeline [find_vulnerabilities] starting
    2023-07-25 05:49:53.11 Step [check_vulnerablecode_service_availability] starting
    2023-07-25 05:49:53.95 Step [check_vulnerablecode_service_availability] completed in 1 seconds
    2023-07-25 05:49:53.97 Step [lookup_vulnerabilities] starting
    2023-07-25 05:52:38.42 Step [lookup_vulnerabilities] completed in 164 seconds (2.7 minutes)
    2023-07-25 05:52:38.43 Pipeline completed

it seems that one of the pipeline steps may have an impact on the extra_data structure.

I guess you are right! We are running each pipeline one by one, and the XLSX error starts at the ROOT_FILESYSTEMS pipeline. Once that pipeline has run, we are unable to download the XLSX format report.
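The status output above lists every pipeline that was run on the project. As a minimal sketch, assuming the `[SUCCESS] name` / `[FAILURE] name` layout seen in that output, the runs can be summarized with a few lines of Python. The helper name `list_pipeline_runs` is hypothetical, and the sample text is a shortened copy of the lines above.

```python
import re

# Shortened sample in the layout of the status output above.
status_output = """\
[FAILURE] deploy_to_develop
[SUCCESS] docker (executed in 4326 seconds)
[SUCCESS] find_vulnerabilities (executed in 795 seconds)
[FAILURE] inspect_manifest
[SUCCESS] root_filesystems (executed in 1046 seconds)
"""

# A run header is a bracketed status followed by the pipeline name.
RUN_HEADER = re.compile(r"\[(SUCCESS|FAILURE)\] (\w+)")


def list_pipeline_runs(text):
    """Return (pipeline_name, status) pairs in the order they appear."""
    return [(name, status) for status, name in RUN_HEADER.findall(text)]


runs = list_pipeline_runs(status_output)
for name, status in runs:
    print(f"{status:8} {name}")
```

A listing like this makes it easy to see at a glance that several unrelated pipelines were executed on the same project.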

tdruez commented 1 year ago

As we are using each pipeline one by one.

Help me understand: why are you running all the pipelines on a single project? Each pipeline has a specific purpose and expects a certain type of input(s). For example, if I want to scan a Docker image, I use the docker pipeline. Running everything one after the other does not make sense and is likely to cause data issues.

Could you clarify what your initial goal is here? If you want to find the vulnerabilities for a given Docker image, then docker + find_vulnerabilities is enough.

parvjain639 commented 1 year ago

why are you running all the pipelines on a single project.

We want to find the list of packages, dependencies, and vulnerabilities of a particular project, so we were trying to run all the pipelines in a single project.

In this case, please let me know when to use the following pipelines:

  1. Root_Filesystems
  2. Scan_Codebase
  3. Scan_Packages
  4. Deploy_to_Develop

How should these four pipelines be used? I am unable to understand them from the documentation. Could you please guide us? We do understand the other pipelines from the documentation.

Could you clarify what is your initial goal here?

I want a complete compliance report covering keywords such as IP: patents, royalties, legal; ECC: export, cryptography, AI, newtech; GDPR: privacy, regulations, chatgpt; OSS: attribution, contribution, distribution streamlined obligations compliance, etc.

So, I guess I should run all pipelines for one project, Right??

tdruez commented 1 year ago

So, I guess I should run all pipelines for one project, Right??

No.

In this case, Please let me know, when to use the following Pipelines Root_Filesystems Scan_Codebase Scan_Packages Deploy_to_Develop

Those are not meant to be run side by side; you need to choose one depending on your input:

I am unable to understand by Documentation.

Detailed documentation about pipelines is available at https://scancodeio.readthedocs.io/en/latest/built-in-pipelines.html#
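As a rough illustration of "choose one depending on your input", the following Python sketch maps a kind of input to a single main pipeline and optionally appends find_vulnerabilities. The mapping is an assumption based on the pipeline names mentioned in this thread, the dictionary keys are hypothetical labels, and `choose_pipelines` is not part of ScanCode.io; check the built-in pipelines documentation before relying on it.

```python
# Assumed mapping from a kind of input to one built-in pipeline name.
# The keys are hypothetical labels made up for this sketch.
PIPELINE_FOR_INPUT = {
    "docker_image": "docker",
    "root_filesystem": "root_filesystems",
    "codebase": "scan_codebase",
    "package_archive": "scan_package",
    "deployment_and_development_trees": "deploy_to_develop",
}


def choose_pipelines(input_kind, with_vulnerabilities=False):
    """Pick one main pipeline for the input, plus an optional vulnerability lookup."""
    pipelines = [PIPELINE_FOR_INPUT[input_kind]]
    if with_vulnerabilities:
        pipelines.append("find_vulnerabilities")
    return pipelines


# The case discussed in this thread: a Docker image plus vulnerabilities.
print(choose_pipelines("docker_image", with_vulnerabilities=True))
```

The design point is that each project gets exactly one main pipeline matching its input, rather than every pipeline in sequence.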

See also the overview of the pipelines in the UI (click on a pipeline name to see the full details of the steps).

Screenshot 2023-07-28 at 13 07 58

parvjain639 commented 1 year ago

Thank You So Much For Clarifications!!!

tdruez commented 1 year ago

A fix for the initial XLSX download issue has been merged into main in https://github.com/nexB/scancode.io/commit/f44fc77f9fad1249c47b3be44076730f7aa7dd53

Also, I've improved the documentation regarding the pipeline choices: https://scancodeio.readthedocs.io/en/latest/faq.html#which-pipeline-should-i-use