gjd6640 / sonar-text-plugin

A free and open-source plugin for SonarSource's SonarQube product that lets you create rules to flag issues in text files.
Apache License 2.0

Duplication in text files with custom extension #8

Status: Closed (closed by RobSmyth 6 years ago)

RobSmyth commented 6 years ago

Hi,

I'm using this plugin and have set up a rule for the file extension *.ABC. All good. But I also want to get a measure of duplication across all ABC files. Currently the duplication analysis does not seem to look at ABC files. Is there a way?

FYI - I did also activate a text duplication rule. No luck.

Thanks

Rob

gjd6640 commented 6 years ago

Hi. Sorry for the delay in responding.

SonarQube's language plugins are each assigned a set of file extensions that they "own". To see the duplication metric you'll want to assign the ABC file extension to the plugin that measures duplication for that language. The text plugin WILL still be able to analyze files with an extension of ABC even when ABC is not a registered extension for sonar-text-plugin.

Specifically, try this:

1) Click "Administration", click "Sonar Text Plugin" on the left, and remove ABC from the list of extensions labelled "File suffixes".
2) Find the appropriate language plugin to perform the duplication analysis on the left side of the screen, click on it, and add the ABC suffix to that plugin's configuration.
3) Rescan a project and see if it worked.

RobSmyth commented 6 years ago

Hi,

Thank you. I see the file extensions for sonar-text-plugin, but I do not see anywhere to assign extensions to be included when measuring duplication. In "Analysis Scope" I see exclusions but not inclusions. When you say "find the appropriate language plugin to perform the duplication", what do you mean? The files are text files that do not conform to any defined language.

The scanner is showing:

INFO: SCM provider for this project is: git
INFO: 846 files to be analyzed
INFO: 560/846 files analyzed
INFO: 842/846 files analyzed
INFO: 842/846 files analyzed
INFO: 845/846 files analyzed
WARN: Missing blame information for the following files:
WARN: * sonar-project.properties
WARN: This may lead to missing/broken features in SonarQube
INFO: Calculating CPD for 0 files
INFO: CPD calculation finished

Thanks

Rob

gjd6640 commented 6 years ago

I’ve just done a little reading about how CPD works. It depends upon a language plugin implementing duplication detection, because finding duplicates accurately is based upon the semantics of each language.

If the files in question contain text-formatted data and not code, then the definition of a duplication within them is probably specific to your use case. Consider looking for existing tools outside of the SonarQube ecosystem that find duplications within a set of text files in a way that accurately finds what you are looking for. Then, just prior to invoking the sonar analyzer on the project’s codebase, run that tool and write its output to a file (text or XML both work). You can then write a custom rule in the text or XML plugin that raises issues on lines in the “duplications report” file.

An assumption that I’ve made here is that the scope of the intended duplication search is between files found within the single project that is being scanned and not across many projects.
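As a rough illustration of the "run a tool first, then scan its report" idea, here is a minimal Python sketch. It is not how CPD works and not part of any SonarQube plugin. It simply treats any run of 5 identical consecutive lines (after trimming whitespace) that appears more than once within the project as a duplication. The function names, the window size, the `.ABC` suffix default, and the `DUPLICATE <file>:<line>` report format are all assumptions made for this example; pick a definition of "duplicate" that fits your data.

```python
from collections import defaultdict
from pathlib import Path


def find_duplicate_blocks(root, suffix=".ABC", window=5):
    """Map each repeated block of `window` consecutive lines to its locations.

    Hypothetical helper: "duplicate" here means an exact match of `window`
    whitespace-trimmed lines, within the files found under `root`.
    """
    seen = defaultdict(list)  # block text -> [(path, 1-based start line), ...]
    for path in sorted(Path(root).rglob("*" + suffix)):
        text = path.read_text(errors="replace")
        lines = [line.strip() for line in text.splitlines()]
        for i in range(len(lines) - window + 1):
            block = lines[i:i + window]
            if not any(block):  # skip windows that are entirely blank
                continue
            seen["\n".join(block)].append((str(path), i + 1))
    # Keep only blocks that occur at more than one location.
    return {block: locs for block, locs in seen.items() if len(locs) > 1}


def write_report(duplicates, report_path):
    """Write one report line per duplicated location (assumed format)."""
    with open(report_path, "w") as out:
        for locs in duplicates.values():
            for file_name, line_no in locs:
                out.write(f"DUPLICATE {file_name}:{line_no} ({len(locs)} copies)\n")
```

Running something like `write_report(find_duplicate_blocks("."), "duplications-report.txt")` just before the scanner would produce a report file in the project directory. A custom text rule matching lines that start with `DUPLICATE` could then raise one issue per reported location. Note that overlapping windows mean a long repeated region is reported several times; merging adjacent hits is left out to keep the sketch short.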

RobSmyth commented 6 years ago

Thanks ...much appreciated help.

Rob