bugfix: Cleaning up floating commas after knip --fix

glemiron commented 2 months ago

I would like to fix this issue: https://github.com/webpro-nl/knip/issues/614.

This problem doesn't allows me to apply knip --fix on scale, because extra commas are breaking typescript check syntax and requires additional work to fix.

I don't sure that is optimal, but I decided to open this PR to start a discussion.

netlify[bot] commented 2 months ago

Deploy request for knip pending review.

Visit the deploys page to approve it

Name	Link
Latest commit	b4bd611c9c4318fb7e7713bfa01a1f9360359156

webpro commented 2 months ago

Hey @glemiron, thanks for bringing this up and starting the PR! I'd absolutely love a better solution here.

The solution with regexes to clean up after the fact was already a bit convoluted, and adding more won't make it much better. The new regexes don't include export anymore so might remove things too broadly.

I have an idea: how about starting at text.substring(end) and look for the next characters being either whitespace or a comma, and if so, also strip that from text. So fix that up directly in the reducer. What do you think?

glemiron commented 2 months ago

It seems there's no way to solve it without a regexp, because we don't have enough information inside reduce to be able safely decide if removal of commas is needed.

Our primary targets for this operation are export {....}, module.exports = { ... } and export { Row as KRow, Row } from './reexported'. In the same time, exports with square brackets must be avoided, because removal commas from them will be unsafe.

Example:

// before
export const [c, d] = [1, 2];

// after c removed
export const [d] = [1, 2];

// expected; comma makes a difference! 
export const [,d] = [1, 2];

I see the following alternative approach to solve this problem within the reduce function:

I see a following alternative way to solve this problem inside the reduce:

Use a regular expression to find all export positions where a comma fix needs to be applied.
For each removed part, check if it belongs to an export and, if so, remove exactly one comma after (before for the last) the variable being removed.

It’s now clear for me that the removal of commas should only be applied within the export context. I'll consider if there's a way to simplify this logic further.

Happy to hear more ideas, thanks for the prompt reply!

webpro commented 2 months ago

It seems there's no way to solve it without a regexp, because we don't have enough information inside reduce to be able safely decide if removal of commas is needed.

We have the full text contents of the source file available. Let's write out the function with some psuedo code/high-level ideas:

const sourceFileText = exportPositions.reduce(
  (text, [start, end]) => {
    const before = text.substring(0, start); 
    const subject = text.substring(start, end); 
    const after = text.substring(end);

    // if subject is e.g. "export " we can return result
    return before + after;

   // otherwise, we could look to the left or right until `{` or `[`
   // if only e.g. `export { }` would remain, we could remove all this
   return before? + after?
  },
  await readFile(filePath, 'utf-8')
);

Inside the reducer, we have the full text, so we can "look around" and see if we're inside square or curly brackets, etc. I think this "inside-out" approach is safer than applying regular expressions "after the fact". We've done the hard work during AST traversal so we can assume we're in the right position to remove things.

Would you agree and like to give this a stab?

glemiron commented 2 months ago

I wasn't able to make this assumption without full context:

We've done the hard work during AST traversal so we can assume we're in the right position to remove things.

Now I have everything to solve it!

webpro commented 2 months ago

Ah, great! One example of an AST visitors is https://github.com/webpro-nl/knip/blob/main/packages/knip/src/typescript/visitors/exports/exportDeclaration.ts and most of them should have // Pattern comments to show what the matching source code looks like. The returned fix value has the [start, end] tuple.

glemiron commented 2 months ago

@webpro what do you think about fixing this issue on the exportDeclaration level? It'll be much easier.

Imagine only one element left inside an import, then we'll have one child inside node.exportClause.elements and we need to return a tuple for whole node for removal instead of a single element. Similar with commas, we have all information there to decide what must go.

With this approach IssueFixer.ts will be responsible only for cutting right piece of text and exportDeclaration will be fully responsible for witch part must be removed.

Draft solution:

export default visit(
  () => true,
  (node, { isFixExports, isFixTypes }) => {
    if (ts.isExportDeclaration(node)) {
      if (node.exportClause && ts.isNamedExports(node.exportClause)) {
        // Patterns:
        // export { identifier, identifier2 };
        // export type { Identifier, Identifier2 };
        const type = node.isTypeOnly ? SymbolType.TYPE : SymbolType.UNKNOWN;
        const sourceFile: BoundSourceFile = node.getSourceFile();
        const declarations = sourceFile.getNamedDeclarations?.();
        return node.exportClause.elements.map((element, index, arr) => {
          const identifier = String(element.name.text);
          const propName = element.propertyName?.text;
          // @ts-expect-error TODO Fix (convenience in addExport)
          // const symbol = element.symbol ?? declarations?.get(identifier)?.find((d: ts.Node) => d !== element)?.symbol;
          const symbol = declarations?.get(propName ?? identifier)?.[0]?.symbol;
          const pos = element.name.pos;

          let fix: Fix = undefined
          if (isFixExports || isFixTypes) {
            if (arr.length === 1) {
              fix= [node.getStart(), node.getEnd()]
            } else if (index === arr.length - 1) {
              // TODO find a comma before
              fix = [element.getStart(), element.getEnd()]
            } else 
            // TODO find a comma after
            fix = [element.getStart(), element.getEnd()]
          }

          return { node: element, symbol, identifier, type, pos, fix };
        });
      }
    }
  }
);

UPD: It doesn't work, because on visitor stage we have no idea what should be removed. I think issueFixer should know about AST

webpro commented 1 month ago

The result of this first phase (of which AST traversal is an important part) is a module graph. It's serializable and it won't contain AST tree or nodes we could, indeed, use to more easily delete unused exports. It would be (too) expensive to recreate it for the "fixer phase".

The module graph is serializable which has a few advantages:

It can be cached and stored on disk
In large codebases/monorepos multiple "programs" becomes heavy/slow
- https://knip.dev/reference/faq#why-doesnt-knip-just-use-tsfindreferences
- https://knip.dev/blog/slim-down-to-speed-up
The module graph as a data structure can be used for other features and perhaps even by other libraries, without relying on TypeScript.

However, we might still be able to have some wins when we're still working with the AST. One idea that's perhaps interesting to include the separator (comma) in the end pos during AST traversal if the tree/node contains this information (i.e. if it's easier/faster compared to doing this in the reducer in the "fixer phase").

In situations with a single item like export { oneValue } we could store start and end of the whole declaration, but for more than one item we would need the same logic anyway to check for commas and empty export {} declaration, so I don't think this helps a lot.

That said, the current fix: [start, end] tuple is a first idea I had, if there's more information we need to store to make our lives easier later on then that's certainly possible.

I should also add a different lib that might be interesting for you, but haven't tried myself yet: https://knip.dev/explanations/comparison-and-migration#ts-remove-unused, no idea if it could work in your situation.

Let me know what you think!

webpro commented 1 month ago

@glemiron I couldn't resist.. Sorry, not trying to dismiss your work here, but your PR just got me excited to work on it!

This might be step forward. My first attempt is in this branch/commit: https://github.com/webpro-nl/knip/commit/c3248612279eb26d5291f96c114c9da3c468d70e

The parser/cleaner is in a separate function so it's easier to test and work with.

Installation:

npm i -D https://pkg.pr.new/knip@c324861

Would love to hear what you think and how it behaves on your project. Feel free to try it out and provide feedback if you have any. Let's make it happen :)

glemiron commented 1 month ago

Nice one! I don't mind to have your solution if it solves the problem.

I like your approach due to its minimalism, I didn't even think to pass something as 3 argument of the pos tuple.

I've drafted a solution from a bit different perspective, I've pushed unfinished change, so you can check the idea. A while ago I was solving a similar problem and instead of deleting anything, it's possible to construct import from scratch.

It's fun problem to solve and a good example that one problem has a lot of different solutions!

webpro commented 1 month ago

That's awesome, tonight I'll some more time and for sure I'll study your solution too. My solution is def not ready, let's combine best of both worlds :)

glemiron commented 1 month ago

I can make mine work tonight and use your approach with 3 element for tuple to pass extra information.

Looking forward for your opinion on my approach!

glemiron commented 1 month ago

It's now obvious that idea to rebuild import content from elements isn't viable. It'll be hard to keep formatting, because imports could be separated not only with spaces, but lines breaks as well.   

Now, it’s clear that we should do more heavy lifting on the fixerSide. However, we could simplify the logic a bit by passing information about boundaries of export elements “{}” to the fixerSide, then the search for a nearest comma could be simplified.

Let me know if I can help with finalising your change. Thanks for your support, I’ve learned a lot about the library! I hope I can use this knowledge and contribute to another features.

webpro commented 1 month ago

It's now obvious that idea to rebuild import content from elements isn't viable. It'll be hard to keep formatting, because imports could be separated not only with spaces, but lines breaks as well.   

Now, it’s clear that we should do more heavy lifting on the fixerSide. However, we could simplify the logic a bit by passing information about boundaries of export elements “{}” to the fixerSide, then the search for a nearest comma could be simplified.

Whatever we'll end up with, your work and the conversation have been super useful. Happy to have explored multiple angles of problem and solution.

Let me know if I can help with finalising your change. Thanks for your support, I’ve learned a lot about the library! I hope I can use this knowledge and contribute to another features.

Here's a new version that also supports type { A } and { type A }. The implementation is still ugly but it does the job. Still needs a lot of testing. Would be great if you could try it on your project(s) too:

npm i https://pkg.pr.new/knip@6b7e30d

Feel free to hack on my branch and apply your ideas if you'd like. I won't be able to work on this for at least another day.

glemiron commented 1 month ago

Applied to my project and found an edge-case:

// initial: 
export const {
  get: someName,
  set: someSetName,
} = getHelpers()

// after the applied fix:
export const {
  ,
  set: someSetName,
} = getHelpers()

someName was also removed by misstake, because it's definitely used inside the file, but it's an another bug.

Looks very promising indeed! I'll able to work on this a little more only one the weekend, so I don't mind if you finish the feature without my help. I'll be happy to help with testing!

webpro commented 1 month ago

Applied to my project and found an edge-case

Thanks for reporting this! Should be fixed now.

Looks very promising indeed! I'll able to work on this a little more only one the weekend, so I don't mind if you finish the feature without my help. I'll be happy to help with testing!

Thanks a lot, this is much fun :) Update:

Fixed: handling of commas
Added: removal of enum and class members

npm i https://pkg.pr.new/knip@e788811

glemiron commented 1 month ago

The new version is tested and fully works for me.

Thank you once again!

webpro commented 1 month ago

:rocket: This issue has been resolved in v5.31.0. See Release 5.31.0 for release notes.

Using Knip in a commercial project? Please consider becoming a sponsor.

webpro commented 1 month ago

For the record, with regards to the automated message, the changeset of this PR is not actually in this release. HOWEVER, thank you so much for initiating this @glemiron because we've managed to improve Knip a lot together here.

webpro-nl / knip