Open xu1119 opened 3 years ago
@xu1119 Thank you: a great find.
FWIW, I usually run a scan with these options --license --license-text --license-text-diagnostics --json-pp -
when I want to validate some bug. So much so that I created an alias in my bash aliases:
alias sca='scancode --license --license-text --license-text-diagnostics --json-pp -'
This may help you.
That said, you found exactly the issue and the problematic rule. The way to resolve this would be to create a false positive rule this way. Add a
src/licensedcode/data/rules/false-positive_not_python.RULE
file with this content:
python module
and a
src/licensedcode/data/rules/false-positive_not_python.yml
data file with this content:
is_false_positive: yes
notes: Do not detect license python module as a Python license
    Seen in https://github.com/keras-team/keras/blob/ad9268d67014273e35faac4ff21cbfe929bf1d2b/keras/utils/generic_utils.py
    and reported in https://github.com/nexB/scancode-toolkit/issues/2377
(the notes are required, but they can be anything useful to qualify and explain the false positive: so here this is just a suggestion)
With these two files added, you can then run scancode --reindex-licenses
to rebuild the license detection index so that it includes the new rule. Then retry your scans: only an Apache license should be reported.
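If you prefer to script the two files rather than create them by hand, here is a minimal sketch. The helper name write_false_positive_rule and its parameters are made up for illustration and are not part of ScanCode; the file names, the rules directory and the is_false_positive and notes fields are the ones described above.

```python
from pathlib import Path


def write_false_positive_rule(rules_dir, name, text, notes):
    """Write NAME.RULE (the text to ignore) and NAME.yml (its data file).

    Hypothetical helper: it only writes two plain-text files.
    """
    rules_dir = Path(rules_dir)
    rules_dir.mkdir(parents=True, exist_ok=True)

    rule_file = rules_dir / f"{name}.RULE"
    data_file = rules_dir / f"{name}.yml"

    # The .RULE file holds the exact text that should be ignored.
    rule_file.write_text(text + "\n", encoding="utf-8")

    # Indent continuation lines so multi-line notes stay a single YAML value.
    indented_notes = "\n    ".join(notes.splitlines())
    data_file.write_text(
        f"is_false_positive: yes\nnotes: {indented_notes}\n",
        encoding="utf-8",
    )
    return rule_file, data_file


if __name__ == "__main__":
    write_false_positive_rule(
        "src/licensedcode/data/rules",
        "false-positive_not_python",
        "python module",
        "Do not detect license python module as a Python license",
    )
```

Run it from the root of a scancode-toolkit checkout, then rebuild the index with scancode --reindex-licenses as described above.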
What this does is basically this: If the text python module
is detected alone and is not part of a larger detection, then this is a false positive that should be ignored.
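To make that behavior concrete, here is a toy model of the idea. This is not ScanCode's actual matching code: the Match class, the token positions and the reported function are all assumptions invented for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    rule: str                        # hypothetical rule identifier
    start: int                       # first matched token position
    end: int                         # last matched token position (inclusive)
    is_false_positive: bool = False


def is_inside(inner, outer):
    """True when `outer` covers `inner` and is strictly longer."""
    covers = outer.start <= inner.start and inner.end <= outer.end
    return covers and (outer.end - outer.start) > (inner.end - inner.start)


def reported(matches):
    """Return the matches that would be reported in this toy model."""
    real = [m for m in matches if not m.is_false_positive]
    # A false-positive match is never reported itself. When it stands alone,
    # nothing is reported for that text; when a larger real match covers it,
    # only that larger match is reported.
    return [m for m in real if not any(is_inside(m, o) for o in real)]


if __name__ == "__main__":
    matches = [
        Match("apache-2.0", 0, 40),
        Match("false-positive_not_python", 45, 46, is_false_positive=True),
    ]
    print([m.rule for m in reported(matches)])
```

In this sketch the standalone "python module" match is dropped and only the Apache match is left, which is the outcome expected from the real scan after reindexing.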
Do you feel like you can submit a PR with this? If you are not comfortable doing it, we can handle it.
Another file, sidecar_evaluator.py, has this text:
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Python utilities required by Keras."""
from __future__ import absolute_import
Should the text python utilities
also be added as a false positive rule?
@xu1119 correct, that would mean adding another rule for this.
Description
When scanning the project https://github.com/keras-team/keras with the latest scancode, I get two false positives. The two files are https://github.com/keras-team/keras/blob/master/keras/distribute/sidecar_evaluator.py and https://github.com/keras-team/keras/blob/master/keras/utils/generic_utils.py
"License" and "python" appear on consecutive lines, and this is detected as the python-2.0 license
and
The text of python_13.RULE is:
license = "Python"
How To Reproduce
scancode -li --license-text --json-pp - generic_utils.py
scancode -li --license-text --json-pp - sidecar_evaluator.py
System configuration