ethereum / solidity

Solidity, the Smart Contract Programming Language
https://soliditylang.org
GNU General Public License v3.0
23.29k stars 5.77k forks source link

Import path normalization (path spec v3) #11411

Closed cameel closed 1 year ago

cameel commented 3 years ago

Closes #5146. Related to #2738. Part of a set of issues that replaces #11138.

Abstract

Compiler's virtual filesystem currently makes very little assumptions about source unit names. Nearly anything is allowed and different names that would refer to the same file in the actual filesystem are considered different. This is not intuitive to users who mostly expect import paths to work like paths in their filesystem.

This proposal introduces some preprocessing for import paths to better match user expectations while still preserving the rule that different keys in the VFS always represent different source units. It also forbids non-normalized source unit names in the VFS because it will no longer be possible to have imports that match them.

The new rules do not apply to import paths that look like URLs since these may have wildly different semantics depending on the protocol.

Motivation

Same file being compiled multiple times

One of the problems caused by allowing non-normalized paths is the compiler loading and compiling the same file multiple times. At best it just wastes resources. At worst it can lead to hard to understand errors caused by the compiler seeing duplicate definitions.

Unintuitive results of normalization in corner cases

Another reason is getting rid of the weird corner cases related to relative imports. Relative imports (i.e. imports starting with ./ or ../) are treated in a special way by the compiler. Unlike direct imports, relative imports undergo some minimal processing. The compiler takes the source unit name of the importing file, strips the number of path segments corresponding to the number of leading ../ segments in the import path, normalizes the import path and then combines the two. The source unit name of the importing file is not normalized. This ensures that relative imports work properly when the importing file is identified with a URL.

For example, assuming that you have a file with source unit name of https://example.com/contract.sol, containing:

import "./token.sol"; // source unit name: https://example.com/token.sol

If the parent source unit name were to be normalized, the name would become https:/example.com/token.sol instead, which is not a valid URL.

On the other hand, this leads to surprising results in corner cases. For example ../ segments are treated as normal path segments and get canceled by ../ segments in the import path.

/project/./lib/contract.sol:

import "../util.sol";       // source unit name: /project/./util.sol
import "../../util.sol";    // source unit name: /project/util.sol
import "../../../util.sol"; // source unit name: /util.sol

URLs

The reason behind exempting URLs from this is that in some schemes ./ and ../ sequences might not represent path segments. We do not want to forbid such URLs but we also do not want to hard-code rules for each particular protocol into the compiler.

For this reason even after this change it will still be possible to run into the unintuitive normalization mentioned above with e.g. file:// or https:// URLs. file:// urls are not really recommended though and with https:// it's up to the server to interpret ../ - if it happens to be a part of the URL, it might not really represent going up in the directory hierarachy so the proposed normalization would not necessarily be desirable for such URLs anyway.

Specification

When converting an import path that does not have a protocol:// prefix (where protocol is an arbitrary URL protocol) into a source unit name:

In the virtual filesystem disallow source unit names that do not have a protocol:// prefix and contain:

Backwards Compatibility

The change is not fully backwards-compatible:

chriseth commented 3 years ago

To be honest, I would prefer the simpler solution where it is just an error to have a non-leading . or .. component inside an import path.

cameel commented 3 years ago

I'd be fine with that too but what about the other points?

cameel commented 3 years ago

We decided to put off work on this for now. We want to deal with #11408, #11409 and #11410 first and go back to this once that's done and working.

github-actions[bot] commented 1 year ago

This issue has been marked as stale due to inactivity for the last 90 days. It will be automatically closed in 7 days.

github-actions[bot] commented 1 year ago

Hi everyone! This issue has been automatically closed due to inactivity. If you think this issue is still relevant in the latest Solidity version and you have something to contribute, feel free to reopen. However, unless the issue is a concrete proposal that can be implemented, we recommend starting a language discussion on the forum instead.