ethereum / solidity

Solidity, the Smart Contract Programming Language
https://soliditylang.org
GNU General Public License v3.0
23.35k stars 5.77k forks source link

Multiple SPDX-License-Identifier in the same section of comments doesn't generate an error #10145

Closed abcoathup closed 3 years ago

abcoathup commented 4 years ago

Description

Multiple SPDX-License-Identifier in the same section of comments doesn't generate an error and compiles.

Environment

Steps to Reproduce

The following with two SPDX License Identifiers doesn't generate an error and compiles.

// SPDX-License-Identifier: MIT

// SPDX-License-Identifier: Something Else

pragma solidity ^0.7.0;

contract MyContract {
}

Whilst two SPDX-License-Identifiers separated by code generates the expected parser error browser/MyContract.sol: ParserError: Multiple SPDX license identifiers found in source file. Use "AND" or "OR" to combine multiple licenses. Please see https://spdx.org for more information.

// SPDX-License-Identifier: MIT

pragma solidity ^0.7.0;

contract MyContract {
}

// SPDX-License-Identifier: Something Else
sambacha commented 4 years ago

SPDX License Registry Validation

The compiler doesn't validate the legitimacy of a license

// SPDX-License-Identifier: NULL /// SPDX-License-Identifier: 0

are valid but

/// SPDX-License-Identifier: ⡉⡊⡋⡌⡍⡎⡏⡐⡑⡒ // SPDX-License-Identifier: ởỠỡỢợỤ are seen as invalid

Multiple Identifiers

As SPDX is used to generate a Bill of Materials for a software system, multiple licenses can co-exist, see https://spdx.github.io/spdx-spec/3-package-information/#314-all-licenses-information-from-files

As for the last part about the parser error, 💯

cameel commented 4 years ago

@abcoathup Yeah, looks like a bug. Only the first line takes effect and the second one is completely ignored. Only MIT ends up in the metadata:

solc /tmp/test.sol --combined-json metadata  | jq
{
  "contracts": {
    "/tmp/test.sol:MyContract": {
      "metadata": "{\"compiler\":{\"version\":\"0.7.4+commit.3f05b770\"},\"language\":\"Solidity\",\"output\":{\"abi\":[],\"devdoc\":{\"kind\":\"dev\",\"methods\":{},\"version\":1},\"userdoc\":{\"kind\":\"user\",\"methods\":{},\"version\":1}},\"settings\":{\"compilationTarget\":{\"/tmp/test.sol\":\"MyContract\"},\"evmVersion\":\"istanbul\",\"libraries\":{},\"metadata\":{\"bytecodeHash\":\"ipfs\"},\"optimizer\":{\"enabled\":false,\"runs\":200},\"remappings\":[]},\"sources\":{\"/tmp/test.sol\":{\"keccak256\":\"0x769732361da909f77e4cb200f8f5920a3e6633d6947dde0b146e8235e0674594\",\"license\":\"MIT\",\"urls\":[\"bzz-raw://b37a2425fc9f4d17d62cedcdf500a4b5b88fe808efc6864a5f71abbe10ff8cb2\",\"dweb:/ipfs/Qmc1BVFCLFDZJvGNteAxup3t25TuKZYmoMyS8Upw37CNos\"]}},\"version\":1}"
    }
  },
  "version": "0.7.4+commit.3f05b770.Linux.g++"
}

@sambacha

// SPDX-License-Identifier: NULL /// SPDX-License-Identifier: 0 are valid

Lines starting with /// are not recognized as specifying the license so it would not cause an error even without this bug. They must be ordinary comments, not NatSpec.

/// SPDX-License-Identifier: ⡉⡊⡋⡌⡍⡎⡏⡐⡑⡒ // SPDX-License-Identifier: ởỠỡỢợỤ are seen as invalid

Currently the compiler simply ignores any comment where the content does not match this regex: SPDX-License-Identifier:\\s*([a-zA-Z0-9 ()+.-]+). Maybe issuing a more specific warning would be a small enhancement but at least it's not silently ignored. User does get a warning that makes it clear that the compiler does not see the license declaration (even if it might not be clear why).

The compiler doesn't validate the legitimacy of a license

As stated in Layout of a Solidity Source File > SPDX License Identifier, that's intentional.

Multiple Identifiers

Separate SPDX declarations can coexist but not within a single file. If you put your contract in a separate file, you can apply a different license to it. Otherwise you should combine the licenses with AND or OR to make it clear how they apply to the file.

sambacha commented 4 years ago

@abcoathup Yeah, looks like a bug. Only the first line takes effect and the second one is completely ignored. Only MIT ends up in the metadata:

solc /tmp/test.sol --combined-json metadata  | jq
{
  "contracts": {
    "/tmp/test.sol:MyContract": {
      "metadata": "{\"compiler\":{\"version\":\"0.7.4+commit.3f05b770\"},\"language\":\"Solidity\",\"output\":{\"abi\":[],\"devdoc\":{\"kind\":\"dev\",\"methods\":{},\"version\":1},\"userdoc\":{\"kind\":\"user\",\"methods\":{},\"version\":1}},\"settings\":{\"compilationTarget\":{\"/tmp/test.sol\":\"MyContract\"},\"evmVersion\":\"istanbul\",\"libraries\":{},\"metadata\":{\"bytecodeHash\":\"ipfs\"},\"optimizer\":{\"enabled\":false,\"runs\":200},\"remappings\":[]},\"sources\":{\"/tmp/test.sol\":{\"keccak256\":\"0x769732361da909f77e4cb200f8f5920a3e6633d6947dde0b146e8235e0674594\",\"license\":\"MIT\",\"urls\":[\"bzz-raw://b37a2425fc9f4d17d62cedcdf500a4b5b88fe808efc6864a5f71abbe10ff8cb2\",\"dweb:/ipfs/Qmc1BVFCLFDZJvGNteAxup3t25TuKZYmoMyS8Upw37CNos\"]}},\"version\":1}"
    }
  },
  "version": "0.7.4+commit.3f05b770.Linux.g++"
}

@sambacha

// SPDX-License-Identifier: NULL /// SPDX-License-Identifier: 0 are valid

Lines starting with /// are not recognized as specifying the license so it would not cause an error even without this bug. They must be ordinary comments, not NatSpec.

/// SPDX-License-Identifier: ⡉⡊⡋⡌⡍⡎⡏⡐⡑⡒ // SPDX-License-Identifier: ởỠỡỢợỤ are seen as invalid

Currently the compiler simply ignores any comment where the content does not match this regex: SPDX-License-Identifier:\\s*([a-zA-Z0-9 ()+.-]+). Maybe issuing a more specific warning would be a small enhancement but at least it's not silently ignored. User does get a warning that makes it clear that the compiler does not see the license declaration (even if it might not be clear why).

The compiler doesn't validate the legitimacy of a license

As stated in Layout of a Solidity Source File > SPDX License Identifier, that's intentional.

Multiple Identifiers

Separate SPDX declarations can coexist but not within a single file. If you put your contract in a separate file, you can apply a different license to it. Otherwise you should combine the licenses with AND or OR to make it clear how they apply to the file.

Thanks @cameel for the clarification, and it makes sense to only validate the presence of SPDX-License-Identifier as opposed to querying the entire list of recognized licenses at SPDX

As for NatSpec, my usage is purely for inventory purposes: makes it easier to automate finding which files do or do not have the proper licenses, especially when creating a DApp that may include contracts that have different license requirements, (e.g. GPL-2.0-Only and GPL-3.0)

e.g.

SPDX="^/// SPDX-License-Identifier:\.$"

touch spdx_header
cat <<EOF >spdx_header
/// SPDX-License-Identifier: ISC
EOF

# CHANGE VALUES AFTER `egrep` FOR FILE EXTENSIONS 
for f in $(find . -type f | egrep '\.(sol)$'); do
    HEADER=$(head -16 $f)
    if [[ $HEADER =~ $SPDX ]]; then
        BODY=$(tail -n +17 $f)
        cat license_header > temp
        echo "$BODY" >> temp
        mv temp $f
    else
        echo -ne "$f was missing header \033[32m FIXED"
        cat spdx_header $f > temp
        mv temp $f
    fi
done

Makes it easier for me to keep track what licenses apply to what aspects of the application. Would be helpful if a separate json of included SPDX licenses could be generated as a bill of materials using the spdx list's definedBy

Thanks,

Sam

axic commented 3 years ago

Currently the compiler simply ignores any comment where the content does not match this regex: SPDX-License-Identifier:\s*([a-zA-Z0-9 ()+.-]+).

Maybe we should strengthen this regexp to SPDX-License-Identifier: ([^\n]) (e.g. matching til end of line).

Or issue a warning if a line with // SPDX-License-Identifer: was found, but ignored?

axic commented 3 years ago

@chriseth any comment?

chriseth commented 3 years ago

@axic I think we should search with the more relaxed regex but then check the license candidate with the stricter regex and warn i it does not match.

axic commented 3 years ago

That makes sense. I did the first step (matching more SPDX lines) after #10653, but did not do the second step yet.

TerranCivilian commented 3 years ago

@axic Can you elaborate a tiny bit on what the "first" and "second" steps are? If you can remember this far back..

I'm hoping to complete this if there is still work on this issue.

chriseth commented 3 years ago

@TerranCivilian The relevant function is Parser::findLicenseString in libsolidity/parsing/Parser.cpp. I think the two steps are those from https://github.com/ethereum/solidity/issues/10145#issuecomment-753838995 and I think the stricter regex is the one from https://github.com/ethereum/solidity/issues/10145#issuecomment-743386097 - i.e. SPDX-License-Identifier: ([^\n])

axic commented 3 years ago

I actually have this work semi-done during the current bug-fixing week, in case @TerranCivilian you haven't started on it?

TerranCivilian commented 3 years ago

I haven't started it, I'm still getting my env setup :-)

axic commented 3 years ago

We're assigning bugs this week which are taken, but since was "still" assigned I haven't moved it. Sorry for the confusion.

Do you want to pick up something else perhaps?

TerranCivilian commented 3 years ago

Yea, it's no problem!