Open OctagonHex opened 4 years ago
While running a reporter on an ORT result with only a scan result is not forbidden, this is a use-case that is not well tested. The usual (and well tested) workflow is to first create an ORT result with an analyzer result, and then use that as the input for the scanner, which creates another ORT result file that combines the analyzer and scan results. Such "rich" ORT result files should work fine to create reports.
Maybe you can give me a hint on how to accomplish my goal. For example: I am trying to analyze an unstructured directory of source code, using the samples from ScanCode-toolkit. I first analyze them, and the analyzer runs OK. As expected, the result is very short and does not contain any dependencies. The problem is that if I use this result as input, the scanner does not even scan the directory! The output from the analyzer does not even contain the source directory. The scanner result now mostly contains "No source artifact URL provided for 'Unmanaged::ScanCode-Samples:'." I also tried adding the project to a local Git repository (without a remote), so the warning about "non-cacheable results" is gone, but the scanner still can't find the source code.
What parameters need to be set so that the analyzer records where the source code is located and the scanner can find it?
C:\oss-review-toolkit>cli\build\install\ort\bin\ort --info analyze -f JSON -i "C:\temp\ScanCode-Samples" -o analyzerOut
________ _____________________
\_____ \\______ \__ ___/ the OSS Review Toolkit, version 0.1.0-SNAPSHOT.
/ | \| _/ | | Running 'analyze' under Java 14.0.1 on Windows 10 with
/ | \ | \ | | ORT_DATA_DIR = C:\Users\USER\.ort
\_______ /____|_ / |____| OS = Windows_NT
\/ \/
More environment variables:
COMSPEC = C:\WINDOWS\system32\cmd.exe
JAVA_HOME = C:\jdk-14.0.1+7
The following package managers are activated:
Bower, Bundler, Cargo, Conan, DotNet, GoDep, GoMod, Gradle, Maven, NPM, NuGet, PhpComposer, PIP, Pipenv, Pub, SBT, Stack, Yarn
Analyzing project path:
C:\temp\ScanCode-Samples
08:16:19.253 [main] INFO org.ossreviewtoolkit.analyzer.Analyzer - Unmanaged projects found in:
08:16:19.255 [main] INFO org.ossreviewtoolkit.analyzer.Analyzer - .
08:16:19.298 [Analyzer-1] INFO org.ossreviewtoolkit.analyzer.PackageManager - Resolving Unmanaged dependencies for 'C:\temp\ScanCode-Samples'...
08:16:19.358 [Analyzer-1] INFO org.ossreviewtoolkit.utils.OrtAuthenticator - Authenticator is already installed.
08:16:19.359 [Analyzer-1] INFO org.ossreviewtoolkit.utils.OrtProxySelector - Proxy selector is already installed.
08:16:19.490 [Analyzer-1] INFO org.ossreviewtoolkit.utils.OrtAuthenticator - Authenticator is already installed.
08:16:19.491 [Analyzer-1] INFO org.ossreviewtoolkit.utils.OrtProxySelector - Proxy selector is already installed.
08:16:20.440 [Analyzer-1] WARN org.ossreviewtoolkit.analyzer.managers.Unmanaged - Analysis of local directory 'C:\temp\ScanCode-Samples' which is not under version control will produce non-cacheable results as no version for the cache key can be determined.
08:16:20.445 [Analyzer-1] INFO org.ossreviewtoolkit.analyzer.PackageManager - Resolving Unmanaged dependencies for 'ScanCode-Samples' took 1.1431624s.
Found 1 project(s) in total.
Writing analyzer result to 'analyzerOut\analyzer-result.json'.
C:\oss-review-toolkit>cli\build\install\ort\bin\ort --info scan -i analyzerOut\analyzer-result.json -o myOut
________ _____________________
\_____ \\______ \__ ___/ the OSS Review Toolkit, version 0.1.0-SNAPSHOT.
/ | \| _/ | | Running 'scan' under Java 14.0.1 on Windows 10 with
/ | \ | \ | | ORT_DATA_DIR = C:\Users\USER\.ort
\_______ /____|_ / |____| OS = Windows_NT
\/ \/
More environment variables:
COMSPEC = C:\WINDOWS\system32\cmd.exe
JAVA_HOME = C:\jdk-14.0.1+7
Using scanner 'ScanCode' with storage 'FileBasedStorage with XZCompressedLocalFileStorage backend'.
Local file storage has 0 scan results files.
08:21:00.843 [main] INFO org.ossreviewtoolkit.scanner.LocalScanner - Bootstrapping scanner 'ScanCode' as required version 3.0.2 was not found in PATH.
08:21:00.846 [main] INFO org.ossreviewtoolkit.scanner.scanners.ScanCode - Downloading ScanCode from https://github.com/nexB/scancode-toolkit/archive/v3.0.2.zip...
08:21:02.056 [main] INFO org.ossreviewtoolkit.scanner.scanners.ScanCode - Retrieved ScanCode from local cache.
08:21:02.497 [main] INFO org.ossreviewtoolkit.scanner.scanners.ScanCode - Unpacking 'C:\Users\USER\AppData\Local\Temp\ort9967510014256878867ScanCode-v3.0.2.zip' to 'C:\Users\USER\AppData\Local\Temp\ort15900786484210730018ScanCode-3.0.2'...
08:21:49.381 [main] INFO org.ossreviewtoolkit.utils.ProcessCapture - Running 'C:\Users\USER\AppData\Local\Temp\ort15900786484210730018ScanCode-3.0.2\scancode-toolkit-3.0.2\scancode.bat --version' in 'C:\Users\USER\AppData\Local\Temp\ort15900786484210730018ScanCode-3.0.2\scancode-toolkit-3.0.2'...
08:22:47.472 [main] INFO org.ossreviewtoolkit.utils.ProcessCapture - Running 'C:\Users\USER\AppData\Local\Temp\ort15900786484210730018ScanCode-3.0.2\scancode-toolkit-3.0.2\scancode.bat --version' in 'C:\Users\USER\AppData\Local\Temp\ort15900786484210730018ScanCode-3.0.2\scancode-toolkit-3.0.2'...
08:22:49.353 [FileBasedStorage with XZCompressedLocalFileStorage backend-1] INFO kotlinx.coroutines.CoroutineScope - Looking for stored scan results for Unmanaged::ScanCode-Samples: and ScannerDetails(name=ScanCode, version=3.0.2, configuration=--copyright --license --ignore *.ort.yml --info --strip-root --timeout 300 --ignore HERE_NOTICE --ignore META-INF/DEPENDENCIES --json-pp) (1/1).
08:22:49.370 [ScanCode-1] INFO kotlinx.coroutines.CoroutineScope - No stored result found for Unmanaged::ScanCode-Samples: and ScannerDetails(name=ScanCode, version=3.0.2, configuration=--copyright --license --ignore *.ort.yml --info --strip-root --timeout 300 --ignore HERE_NOTICE --ignore META-INF/DEPENDENCIES --json-pp), scanning package in thread 'ScanCode-1' (1/1).
08:22:49.373 [ScanCode-1] INFO org.ossreviewtoolkit.downloader.Downloader - Trying to download source code for 'Unmanaged::ScanCode-Samples:'.
08:22:49.377 [ScanCode-1] INFO org.ossreviewtoolkit.downloader.Downloader - Trying to download 'Unmanaged::ScanCode-Samples:' sources to 'C:\oss-review-toolkit\myOut\downloads\Unmanaged\unknown\ScanCode-Samples\unknown' from VCS...
08:22:49.380 [ScanCode-1] INFO org.ossreviewtoolkit.downloader.Downloader - Trying to download source artifact for 'Unmanaged::ScanCode-Samples:' from ...
08:22:49.384 [ScanCode-1] ERROR org.ossreviewtoolkit.scanner.LocalScanner - Could not download 'Unmanaged::ScanCode-Samples:': DownloadException: Download failed for 'Unmanaged::ScanCode-Samples:'.
Suppressed: DownloadException: No VCS URL provided for 'Unmanaged::ScanCode-Samples:'.,
Suppressed: DownloadException: No source artifact URL provided for 'Unmanaged::ScanCode-Samples:'.
08:22:49.385 [ScanCode-1] INFO kotlinx.coroutines.CoroutineScope - Finished scanning Unmanaged::ScanCode-Samples: in thread 'ScanCode-1' (1/1).
08:22:49.388 [main] INFO org.ossreviewtoolkit.model.OrtResult - Computing excluded projects which may take a while...
08:22:49.390 [main] INFO org.ossreviewtoolkit.model.OrtResult - Computing excluded projects done.
Writing scan result to 'myOut\scan-result.yml'.
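The failure in the log above can be read as a two-step fallback: the downloader first tries the VCS, then the source artifact, and gives up when neither URL is set. The following is a minimal Python sketch of that behavior as it appears in the log, not ORT's actual implementation; the `Package` record, its field names, and the `download` function are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import List, Optional


class DownloadException(Exception):
    """Illustrative stand-in for the exception named in the log."""

    def __init__(self, message: str, suppressed: Optional[List[Exception]] = None):
        super().__init__(message)
        self.suppressed = list(suppressed or [])


@dataclass
class Package:
    # Hypothetical, simplified package record; not ORT's real model.
    id: str
    vcs_url: str = ""
    source_artifact_url: str = ""


def download(pkg: Package) -> str:
    """Return the location that would be fetched, trying the VCS first."""
    failures: List[Exception] = []

    if pkg.vcs_url:
        return pkg.vcs_url
    failures.append(DownloadException(f"No VCS URL provided for '{pkg.id}'."))

    if pkg.source_artifact_url:
        return pkg.source_artifact_url
    failures.append(
        DownloadException(f"No source artifact URL provided for '{pkg.id}'.")
    )

    # Both sources are empty, which is the case for an unmanaged local directory.
    raise DownloadException(f"Download failed for '{pkg.id}'.", failures)
```

For a package like `Package(id="Unmanaged::ScanCode-Samples:")`, both branches are skipped and the call raises with two suppressed causes, which matches the shape of the error in the log.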
I'm trying to sum up the current status here: An OrtResult contains a Repository that in turn contains a VcsInfo. The latter cannot be set to anything meaningful if the analyzed directory is not under version control.
Instead of doing something hacky like setting it to VcsInfo.EMPTY, an idea is to replace the current Repository with something like a new AnalyzerInput class with a Provenance instead of strictly VCS-related classes. Maybe NestedProvenance could also be generalized a bit so that AnalyzerInput could use it to also substitute Repository's nestedRepositories. When a directory that is not under version control is analyzed, the provenance would be set to UnknownProvenance.
In that context, maybe RepositoryConfiguration could also be renamed to something more general like ProductConfiguration or so.
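To make the proposal more concrete, here is a rough sketch of what such an AnalyzerInput could look like. It is written in Python for brevity and is not ORT code: the field names, the `RepositoryProvenance` shape, and the `describe_input` helper are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Union


@dataclass(frozen=True)
class VcsInfo:
    # Simplified; field names are assumptions.
    type: str
    url: str
    revision: str


@dataclass(frozen=True)
class RepositoryProvenance:
    # Assumed shape of a VCS-backed provenance.
    vcs_info: VcsInfo
    resolved_revision: str


@dataclass(frozen=True)
class UnknownProvenance:
    """Used when the analyzed directory is not under version control."""


Provenance = Union[RepositoryProvenance, UnknownProvenance]


@dataclass
class AnalyzerInput:
    # Replaces Repository's VcsInfo with a more general Provenance.
    provenance: Provenance
    # Hypothetical substitute for Repository's nestedRepositories, keyed by path.
    nested_provenances: Dict[str, Provenance] = field(default_factory=dict)


def describe_input(vcs_info: Optional[VcsInfo]) -> AnalyzerInput:
    """Build the input record; fall back to UnknownProvenance without VCS."""
    if vcs_info is None:
        return AnalyzerInput(provenance=UnknownProvenance())
    return AnalyzerInput(
        provenance=RepositoryProvenance(vcs_info, vcs_info.revision)
    )
```

The point of the sketch is only that a directory without version control gets an explicit UnknownProvenance instead of an empty VcsInfo.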
Easily reproducible (requires Git) by simulating a fake monorepo:
mkdir test
cd test
git clone https://github.com/apple/swift-nio.git
git clone https://github.com/sw360/sw360python.git
ort analyze -i . -o output
I'm having a look at this refactoring. Let me know if you have any more input.
[...] Maybe also NestedProvenance could be generalized a bit so AnalyzerInput could use it to also substitute Repository's nestedRepositories. [...]
@sschuberth I noticed that NestedProvenance is located inside org.ossreviewtoolkit.scanner.provenance rather than org.ossreviewtoolkit.model. From what I understand about the code so far, however, most data structures, such as Provenance and Repository, are located inside the model.
Am I correct in assuming that NestedProvenance was only defined in the scanner, since it was only utilized there up until now, and that it would generally make sense to move it into the model? In the case of AnalyzerInput, which should probably also be located in model, importing NestedProvenance inside AnalyzerInput seems to cause a circular dependency between model and scanner.
Could moving the NestedProvenance to model be a good first step (pull request) in preparation for the AnalyzerInput? Or am I missing something here?
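The circular dependency concern can be illustrated with a toy module graph. This is a generic depth-first cycle check in Python, not anything from the ORT build; the graph contents below are assumptions that mirror the situation described (scanner depends on model, and model would start depending on scanner).

```python
from typing import Dict, List, Set


def find_cycle(deps: Dict[str, Set[str]]) -> List[str]:
    """Return one dependency cycle as a list of module names, or [] if none."""
    visiting: Set[str] = set()
    done: Set[str] = set()
    path: List[str] = []

    def visit(node: str) -> List[str]:
        if node in done:
            return []
        if node in visiting:
            # Found a back edge: slice the current path from the repeated node.
            return path[path.index(node):] + [node]
        visiting.add(node)
        path.append(node)
        for dep in sorted(deps.get(node, ())):
            cycle = visit(dep)
            if cycle:
                return cycle
        path.pop()
        visiting.discard(node)
        done.add(node)
        return []

    for module in sorted(deps):
        cycle = visit(module)
        if cycle:
            return cycle
    return []


# Assumed current state: scanner depends on model only.
today = {"scanner": {"model"}, "model": set()}
# AnalyzerInput in model importing NestedProvenance from scanner.
with_import = {"scanner": {"model"}, "model": {"scanner"}}
```

Under these assumptions, `find_cycle(today)` reports no cycle, while `find_cycle(with_import)` reports one. Moving NestedProvenance into model first would bring the graph back to the shape of `today`.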
Am I correct in assuming that NestedProvenance was only defined in the scanner, since it was only utilized there up until now, and that it would generally make sense to move it into the model?
Maybe not "generally", but in the context of this refactoring, yes, if we agree that this refactoring makes sense. I'd esp. like to hear @mnonnenmacher's opinion here.
Could moving the NestedProvenance to model be a good first step (pull request) in preparation for the AnalyzerInput?
See above. I'd like to first have a consensus among the core devs that this refactoring is the way to go.
@mnonnenmacher for an overview of changes, I opened a pull request https://github.com/oss-review-toolkit/ort/pull/8724.
During today's ORT community meeting, we discussed possible solutions for allowing non-vcs projects to be analyzed and scanned.
Our use case at HELLA would be to scan non-vcs projects, not just analyze them. This distinction had not been mentioned explicitly up until now.
In light of that use case, @fviernau and @sschuberth advised abandoning the previously suggested course of allowing UnknownProvenance as an input for the analyzer. Instead, they put forward a new refactoring approach:
- Replace Repository's VcsInfo (and related variables) with KnownProvenance, making it less dependent on VcsInfo.
- Add LocalProvenance as a new data class for KnownProvenance, which contains a local directory path.
This would allow the analyzer and scanner to handle non-VCS projects as a Provenance, as long as both steps are done on the same machine with the same directory structure.
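A rough sketch of this second approach, again in Python and purely illustrative: the class shapes, field names, and the `source_location` helper are assumptions, not the design that was agreed on in detail.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class RepositoryProvenance:
    # Assumed, simplified VCS-backed provenance.
    vcs_url: str
    resolved_revision: str


@dataclass(frozen=True)
class LocalProvenance:
    # New kind of known provenance that only carries a local directory path.
    directory: str


KnownProvenance = Union[RepositoryProvenance, LocalProvenance]


@dataclass
class Repository:
    # Holds a KnownProvenance instead of a bare VcsInfo.
    provenance: KnownProvenance


def source_location(repo: Repository) -> str:
    """Tell a later step (e.g. the scanner) where the sources can be found."""
    prov = repo.provenance
    if isinstance(prov, LocalProvenance):
        # Only meaningful on the machine (and directory layout) used for analysis.
        return prov.directory
    return f"{prov.vcs_url}@{prov.resolved_revision}"
```

With this shape, a result produced from an unversioned directory still records where the sources were, so a scan on the same machine has something to resolve.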
PR #8724 will be dropped in favor of this new approach. I will post any updates or findings here.
Further input and discussion on this topic is welcome.
Hello, I successfully created a scan result with the following command. (I used the scancode-toolkit examples.)
cli\build\install\ort\bin\ort scan -p "C:\scancode-toolkit-3.1.1\samples" -o myOut
Output:
If I look at the .yml file, it looks good and contains many licenses.
Now I try to generate any kind of report. My goal is to generate an attribution notice. So I run the command:
cli\build\install\ort\bin\ort report -i myOut\scan-result.yml -o myOutReport -f Excel
but the output shows an error.
For -f NoticeSummary or -f NoticeByPackage, OSS-RT seems to work at first glance:
But despite the many licenses in the .yml, the resulting report is empty, i.e. it says:
This project neither contains or depends on any third-party software components.
What is the problem, or how can this be fixed?
I attached my scan result for easy reference. scan-result.yml.txt