IBM / data-prep-kit

Open source project for data preparation of LLM application builders
https://ibm.github.io/data-prep-kit/
Apache License 2.0

[Feature] Build a transform to remove dead code from code files #84

Open Bytes-Explorer opened 5 months ago

Bytes-Explorer commented 5 months ago

Component

Transforms/code/code_quality

Feature

The goal is to remove dead code from code files. The routine should work across 100+ programming languages and be easy to extend to more.

ykalathiya commented 3 months ago

There are various approaches that can be used to detect dead code effectively:

  1. Using a Pipeline with Language-Specific Libraries:

     - Python: vulture
     - JavaScript: esprima
     - Java: javalang
     - C/C++: libclang
     - PHP: PHPStan

These libraries parse source code into abstract syntax trees (ASTs) or run static analysis on it, which makes it possible to identify unused variables, classes, and functions.
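As a rough illustration of the AST idea, here is a minimal sketch that uses only Python's standard `ast` module rather than any of the libraries listed above. The names `find_unused_python_defs`, `DETECTORS`, and `detect_dead_code` are hypothetical and are not part of data-prep-kit. It only flags top-level imports, functions, and classes whose names are never read anywhere in the same file, so it will miss dynamic usage, names exported through `__all__`, and cross-file references.

```python
import ast
import os
from typing import Callable, Dict, List, Optional


def find_unused_python_defs(source: str) -> List[str]:
    """Return top-level imports/functions/classes never read in this file.

    Hypothetical helper: a deliberately naive, single-file check.
    """
    tree = ast.parse(source)

    # Collect names introduced at module level.
    defined = set()
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            defined.add(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                if alias.name == "*":
                    continue  # star imports cannot be tracked by name
                # "import a.b" binds "a"; "import a.b as c" binds "c"
                defined.add(alias.asname or alias.name.split(".")[0])

    # Collect every name that is read (loaded) anywhere in the file.
    used = {
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
    }
    return sorted(defined - used)


# Hypothetical registry: one detector per file extension, so support for
# another language is added by registering one more function.
DETECTORS: Dict[str, Callable[[str], List[str]]] = {
    ".py": find_unused_python_defs,
}


def detect_dead_code(filename: str, source: str) -> Optional[List[str]]:
    """Dispatch on file extension; return None for unsupported languages."""
    ext = os.path.splitext(filename)[1].lower()
    detector = DETECTORS.get(ext)
    return detector(source) if detector else None


SAMPLE = """
import os
import sys

def used():
    return sys.argv

def unused():
    return 1

print(used())
"""

print(detect_dead_code("example.py", SAMPLE))  # ['os', 'unused']
```

A registry keyed by extension is one possible way to keep the routine extensible: each language-specific library could be wrapped in a function with the same signature and added to `DETECTORS` without touching the dispatch logic.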

  2. Utilizing Machine Learning Models:

Another approach is to take a pretrained code model such as CodeBERT and fine-tune it for each target programming language. With supervised learning on labeled examples, the model can be adapted to detect dead code across different codebases.