Feature request: Indicate position in code

LinqLover commented 3 years ago

For my use case, it would be very helpful to trace every import back to the place in a file where it was defined. This could be achieved by adding a field like Import.moduleSpecifier.position: {row, column} and storing these data during parsing.

TomerAberbach commented 3 years ago

I wouldn't be opposed to implementing something like that, but would want to understand your use case a bit first.

What would you be using it for? Wondering if having the start and end positions of every part (i.e. the import clause, the module specifier, etc.) rather than just the module specifier could be useful. Also wondering if having the string indices (in addition to the row and column) would be useful

TomerAberbach commented 3 years ago

I actually thought about it a bit more, and I think only string indices, rather than row-column, would make sense for parse-imports to support.

Thing is, strings are more complicated than we give them credit for. A "character" doesn't always display as one column wide on a screen. Sometimes it's multiple Unicode code points (like in the case of Chinese characters, letters with accents, emojis, etc.). Sounds complicated enough that this library probably shouldn't be responsible for it.

I would have expected for there to be mature libraries supporting this, but I tested 3 popular libraries and 1 less popular library and not a single a one is correct!

import linesAndColumns from 'lines-and-columns'
import lineColumn from 'line-column'
import findLineColumn from 'find-line-column'
import { lineCol } from 'line-column-mini'

const LinesAndColumns = linesAndColumns.default;

// 🦅 takes up two characters
const string = '🦅🦅!'
let indexOfExclamationPoint = 4

console.log(new LinesAndColumns(string).locationForIndex(indexOfExclamationPoint))
//=> { line: 0, column: 4 } (wrong)

console.log(lineColumn(string).fromIndex(4))
//=> { line: 1, col: 5 } (wrong)

console.log(findLineColumn(string, indexOfExclamationPoint))
//=> { line: 1, col: 4 } (wrong)

console.log(lineCol(string, indexOfExclamationPoint))
//=> { col: 5, line: 1 } (wrong)

In 1-based indexing the correct answer is { line: 1, column: 3 }. Maybe we should send a PR to one (or all) of them... :sweat_smile:

LinqLover commented 3 years ago

Thank you for your feedback! For my use case, only row/col start position is relevant. You can check out https://github.com/LinqLover/downstream-repository-mining/blob/a489715130f11dea88b1aea8f3d94a0f6115d7f4/src/references.ts#L178 for further information, but basically, I am collecting references to a package to allow users to explore the dependents of their package.

Yes, extracting the row/col calculation into another package would also be fine ... Then you can blame them for the Unicode errors. 😆

TomerAberbach commented 3 years ago

Gotcha! I can try to implement including the string indices soonish and then you'd be able to use one of these packages (or your own implementation) to convert the string index to line-column (the Unicode issue with these packages is unfortunate, but they'll still work for the majority of cases)

LinqLover commented 3 years ago

Woah, this would be great - even though I won't be able to use it before #3 is resolved.

TomerAberbach commented 3 years ago

Would you expect start/end index for module specifier to include the quotes? Meaning, for the following code:

import x from 'wow'

Should code.substring(startIndex, endIndex) give us 'wow' or wow. Not really sure which is better (it's also easy to convert between the two)

LinqLover commented 3 years ago

Ah, personally I would have expected code.substring(startIndex, endIndex) == "import x from 'wow'". But I don't have a strong opinion on that. :-)

TomerAberbach commented 3 years ago

Oh sorry, I wasn't clear. My plan is to add index ranges to multiple parts of the returned object. So the new type will look something like this:

type ModuleSpecifierType =
  | 'invalid'
  | 'absolute'
  | 'relative'
  | 'builtin'
  | 'package'
  | 'unknown'

type Identifier = {
  startIndex: number
  endIndex: number
  identifier: string
}

type Import = {
  startIndex: number
  endIndex: number
  isDynamicImport: boolean
  moduleSpecifier: {
    startIndex: number
    endIndex: number
    type: ModuleSpecifierType
    isConstant: boolean
    code: string
    value?: string
    resolved?: string
  }
  // Would be a breaking change for this object, but maybe it's worth it
  importClause?: {
    default?: Identifier
    named: Identifier[]
    namespace?: Identifier
  }
}

So the top-level startIndex and endIndex would return what you wrote, but I was trying to think of what would make sense for moduleSpecifier

donmccurdy commented 3 years ago

Wanted to add a +1 for this – my use case should (similarly) be solved by having start/end indices of the entire import statement. I'm writing a REPL (think something like JSFiddle) allowing users to write a script that imports from a set of pre-defined "packages". To execute the script in the browser I need to replace the imports with other code. For example:

// Before
import A, {B, C as D} from 'my-package';

// After
const A = dependencies['my-package'];
const {B, C: D} = dependencies['my-package'];

TomerAberbach commented 3 years ago

Thanks for the +1 and for including your use-case!

I ended up just including the start and end indices of the full statement/expression. If anyone else needs more granular indices, then I'll add them at that point

TomerAberbach commented 3 years ago

The version the start and end indices are available in is 1.1.0

LinqLover commented 3 years ago

Great work, thanks for the new feature! 🎉

TomerAberbach / parse-imports

Feature request: Indicate position in code #2