dataset？ - Githubissues

yueyuep commented 4 years ago

Hello! I cloned your code and set up the configuration. When I run UnfixedAlarmCollector.scala, it seems to need the "fixed-summary-project-vtype.csv" dataset. Could you help me?

Kui-Liu commented 4 years ago

Hello Dongsun,

Could you address Peng's problem?

Thanks.

Best regards,

Kui

On Mon, 25 May 2020 at 19:38, PengLi notifications@github.com wrote:

Hello! I cloned your code and set up the configuration. When I run UnfixedAlarmCollector.scala, it seems to need the "fixed-summary-project-vtype.csv" dataset. Could you help me?

```scala
package edu.lu.uni.serval.alarm.tracking

import edu.lu.uni.serval.alarm.util.db.graph.neo4j.VioDBFacade
import edu.lu.uni.serval.alarm.util.TrackingHelperUtils
import org.neo4j.driver.v1.Value
import java.io.File
import com.github.tototoshi.csv.CSVReader
import org.eclipse.jgit.treewalk.TreeWalk
import org.apache.commons.io.FileUtils
import scala.collection.mutable._
import scala.collection.JavaConverters._
import edu.lu.uni.serval.alarm.tracking.TrackingUtils._

object UnfixedAlarmCollector {
  def main(args: Array[String]): Unit = {
    doTask("fixed-summary-project-vtype.csv", args(0))
  }

  def doTask(fixSummaryPath: String, project: String): Unit = {
    val summaryFile = new File(fixSummaryPath)
    val summaryList = CSVReader.open(summaryFile).all()

    // prepare project to pack map
    val prj2PackMap = TrackingHelperUtils.readProject2PackMap("/home/darkrsw/repo/false-alarm-study/prj-pack.map")
    println("prj2pack map size: " + prj2PackMap.size)
    val repoPathTemplate = "/mnt/archive1/data/violations/repos/repos-%s/%s/.git"

    //val project = "Activiti-Activiti"
    val outFile = new File(s"unfixed-$project.csv")

    val pack = prj2PackMap(project)
    val repoPath = repoPathTemplate.format(pack, project)
    val tracker = new AlarmTracker(project)
    tracker.initGitRepo(repoPath)
    val gitproxy = tracker.repoProxy.get

    val perProject = summaryList.filter(entry => entry(0) == project)

    if( perProject.size < 1 ) {
      println(s"No alarms in $project?")
      return
    }

    // collect vtype of fixed alarms in the project
    val vtypeSet = Set[String]()
    perProject.foreach( entry => vtypeSet += entry(1) )

    VioDBFacade.init()

    val results = VioDBFacade.session.run(
      s"""match (n:Violation {project: '$project'}) where NOT (n)-[:CHILD]->(:Violation) return n""")
    /*
    val results = VioDBFacade.session.run(
      s"""match (n:Violation {id: 'Activiti-Activiti:a8e456784456f93a5c808d241b156cf79b0985ce:org.activiti.engine.impl.bpmn.behavior.IntermediateThrowCompensationEventActivityBehavior:BC_UNCONFIRMED_CAST:41:41'}) return n""")
    */
    val dlist = results.list().asScala.toList
    println("Neo4J query completed: "+dlist.size)

    // rearrange commits and files
    val commitFileMap = Map[String, Map[String, Set[Value]]]()

    var counter = 0
    // collect commits and files.
    dlist.foreach(line => {
      try {
        val node = line.get("n")
        val commitHash = node.get("commit").asString()
        val resolution = if(node.get("resolution").isNull()) "" else node.get("resolution").asString()
        val vtype = node.get("vtype").asString()

        val infoTokens = node.get("id").asString().split(":")
        val packagePath = infoTokens(2)

        val mainClassPath = if(packagePath.contains("$"))
        {
          val tokens2 = packagePath.split("\\$")
          tokens2(0)
        }
        else packagePath

        val fileValueMap = if( commitFileMap.contains(commitHash))
        {
          commitFileMap(commitHash)
        }
        else
        {
          val newMap = Map[String, Set[Value]]()
          commitFileMap.put(commitHash, newMap)
          newMap
        }

        val valueSet = if( fileValueMap.contains(mainClassPath) )
        {
          fileValueMap(mainClassPath)
        }
        else
        {
          val newSet = Set[Value]()
          fileValueMap.put(mainClassPath, newSet)
          newSet
        }

        if( resolution != "fixed" && vtypeSet.contains(vtype))
        {
          counter+=1
          valueSet.add(node)
        }
      } catch {
        case e: Throwable => println("error: "+e.getMessage)
      }
    })

    println("#Node after filtering: "+counter)

    commitFileMap.foreach( entry => {
      val commitHash = entry._1
      val fileValueMap = entry._2
      println("Processing "+commitHash)

      val commit = gitproxy.getCommitByHash(commitHash)

      val treeWalk = new TreeWalk( gitproxy.getRepository() )
      val tree = commit.getTree()
      treeWalk.addTree(tree);
      treeWalk.setRecursive(true);

      while( treeWalk.next() ) {
        val path = treeWalk.getPathString()
        //println(path)
        if( path.toLowerCase().endsWith(".java") )
        {
          val source = getSourceText( commit, path, gitproxy )
          val thisPackagePath = parseAndExtractPackagePath( source )
          val className = takeFileName(path)
          val targetPath = thisPackagePath+"."+className

          if(fileValueMap.contains(targetPath))
          {
            val valueSet = fileValueMap(targetPath)

            valueSet.foreach( node => {
              val vtype = node.get("vtype").asString()
              val commitHash = node.get("commit").asString()
              val sLine = node.get("sLine").asInt()
              val eLine = node.get("eLine").asInt()
              val resolution = if(node.get("resolution").isNull()) "" else node.get("resolution").asString()

              FileUtils.write(outFile, s"$vtype,$project,$commitHash,$path,$sLine,$eLine\n", "UTF-8", true)
            })
          }
        }
      }

      treeWalk.close();
    })

    println("Original Neo4J query completed: "+dlist.size)
    println("#Node after filtering: "+counter)
    VioDBFacade.close()
  }
}
```


darkrsw commented 4 years ago

Hi @yueyuep,

Basically, you can generate the file fixed-summary-project-vtype.csv from your own dataset, since its content depends on the data you are working on. That is why I did not include the file when I shared the code.

Anyhow, you can find my version of the file here.

Regards, Dongsun.

yueyuep commented 4 years ago

Hi, thanks for your help. I have seen the dataset you shared. It includes three columns, but I don't know the meaning of each column. Is there code to generate fixed-summary-project-vtype.csv?

According to your README, I need to run three commands. When I run the first one, it needs fixed-summary-project-vtype.csv and also prj-pack.map. Also, UnfixedAlarmCollector.scala seems to depend on the Neo4j database; it needs to query some results from that database.

darkrsw commented 4 years ago

@yueyuep fixed-summary-project-vtype.csv has three columns and they are project, violation type, # of violations, respectively. This file is just for statistics. You can simply create this file from several projects.
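For illustration, a minimal sketch of producing such a three-column summary from a list of fixed violations. This is not code from the repository: the `FixedViolation` case class, the object name, and the choice to write no header row are all assumptions made for the example.

```scala
import java.io.{File, PrintWriter}

object FixedSummarySketch {
  // Hypothetical input shape: one record per fixed violation.
  final case class FixedViolation(project: String, vtype: String)

  // Count fixed violations per (project, violation type); sorted for stable output.
  def summarize(fixed: Seq[FixedViolation]): Seq[(String, String, Int)] =
    fixed
      .groupBy(v => (v.project, v.vtype))
      .map { case ((project, vtype), group) => (project, vtype, group.size) }
      .toSeq
      .sortBy { case (project, vtype, _) => (project, vtype) }

  // Write one "project,vtype,count" row per line (no header row; an assumption).
  def writeCsv(rows: Seq[(String, String, Int)], out: File): Unit = {
    val writer = new PrintWriter(out, "UTF-8")
    try rows.foreach { case (p, t, n) => writer.println(s"$p,$t,$n") }
    finally writer.close()
  }
}
```

Calling `FixedSummarySketch.writeCsv(FixedSummarySketch.summarize(records), new File("fixed-summary-project-vtype.csv"))` would then give a file whose first two columns match what `UnfixedAlarmCollector` reads via `entry(0)` and `entry(1)`.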

prj-pack.map is even simpler and you might not need this file. It looks like:

ckarthickit-HelloWorld:repo-a
jetty-project-codehaus-jetty6:repo-a
matsim-org-matsim:repo-a
apache-ofbiz:repo-a
...

I had 2000+ projects at the beginning and I had to put them in different directories just for convenience. So, you can ignore the file; just think of it as a list of projects.
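As a rough sketch of the `project:pack` format shown above, the following parses such lines into a map. It is a stand-in written for this thread, not the repository's `TrackingHelperUtils.readProject2PackMap`; the object and method names are hypothetical, and skipping malformed lines is an assumption.

```scala
import scala.io.Source

object PrjPackMapSketch {
  // Parse "project:pack" lines; blank or malformed lines (e.g. "...") are skipped.
  def parse(lines: Iterator[String]): Map[String, String] =
    lines
      .map(_.trim)
      .filter(_.nonEmpty)
      .flatMap { line =>
        line.split(":", 2) match {
          case Array(project, pack) if project.nonEmpty && pack.nonEmpty =>
            Some(project -> pack)
          case _ => None
        }
      }
      .toMap

  // Read the map from a file on disk.
  def readFile(path: String): Map[String, String] = {
    val src = Source.fromFile(path, "UTF-8")
    try parse(src.getLines()) finally src.close()
  }

  // With only a few projects, a single pack name can stand in for the file.
  def singlePack(projects: Seq[String], pack: String): Map[String, String] =
    projects.map(_ -> pack).toMap
}
```

For a handful of projects, something like `PrjPackMapSketch.singlePack(Seq("apache-ofbiz"), "repo-a")` avoids maintaining the file at all.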

Neo4j was only necessary to keep track of identical violations; you need to implement something like the technique shown in the following paper.

P. Avgustinov et al., “Tracking Static Analysis Violations over Time to Capture Developer Characteristics,” in Proceedings of the 37th International Conference on Software Engineering - Volume 1, Piscataway, NJ, USA, 2015, pp. 437–447, Accessed: Oct. 30, 2015. [Online]. Available: http://dl.acm.org/citation.cfm?id=2818754.2818809.

Regards, Dongsun.

yueyuep commented 4 years ago

Thanks, that helps me a lot. In violation-collection, you use FindBugs to analyze the code and collect bug reports. Before analyzing, FindBugs has to be configured. AlarmExpExecutor.scala needs a conf file, and I don't know the format of this conf file. Could you share an example file?

darkrsw commented 4 years ago

@yueyuep You can find AutoExpConfShellWriter.scala in violation-collection/src/main/scala/edu/lu/uni/serval/alarm/util.

You need to create a conf file for each project you work on. As the file (i.e., AutoExpConfShellWriter.scala) has many hard-coded strings, you have to adapt much of the source code for your subjects.

Cheers, Dongsun.

kimgimkigi commented 2 years ago

@darkrsw

1.

> prj-pack.map is even simpler and you might not need this file.

Is there a specific reason that prj-pack.map is not needed? In your code, prj-pack.map is used in other lines. If I do need prj-pack.map, is it correct to just add ":repo-a" after the project name, like "jenkinsci-acceptance-test-harness:repo-a"?

2. AutoExpConfShellWriter.scala looks like it needs findbugs-exclude-filter.xml. I cannot find findbugs-exclude-filter.xml in your repositories. Could you please share it?

3. In AutoExpConfShellWriter.scala, I have to set tmpDir, javaRT, and libDir. What are the expected paths for tmpDir and libDir? I cannot tell what they should be.

It seems I will have more questions about the setup. Thanks for all of your kind answers!

darkrsw commented 2 years ago

@kimgimkigi

> prj-pack.map is even simpler and you might not need this file.
>
> Is there a specific reason that prj-pack.map is not needed? In your code, prj-pack.map is used in other lines. If I do need prj-pack.map, is it correct to just add ":repo-a" after the project name, like "jenkinsci-acceptance-test-harness:repo-a"?

The file is just for batch processing of multiple projects. Back when I was running the experiment, there were too many projects to process at once, so I divided the projects into packs. When running the experiment on a few projects, it is not necessary.

> AutoExpConfShellWriter.scala looks like it needs findbugs-exclude-filter.xml. I cannot find findbugs-exclude-filter.xml in your repositories. Could you please share it?

It is optional. Some projects specify "don't care" warnings, and findbugs-exclude-filter.xml is a default file name for listing them. If the project has no findbugs-exclude-filter.xml, you can just ignore it.

> In AutoExpConfShellWriter.scala, I have to set tmpDir, javaRT, and libDir. What are the expected paths for tmpDir and libDir? I cannot tell what they should be.

tmpDir is literally a temporary directory for some intermediate files. libDir is for .jar files, which are dependencies of the target project. You can copy all dependencies into the directory.

> It seems I will have more questions about the setup. Thanks for all of your kind answers!

Thanks for your interest in our work again!
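Regarding libDir above, here is a minimal sketch of gathering a project's dependency .jar files into one directory. It is only an illustration and not part of the repository: `LibDirSketch`, `collectJars`, and `depsRoot` are hypothetical names, and it assumes libDir lies outside the directory being scanned and that jars sharing a file name may overwrite each other.

```scala
import java.nio.file.{Files, Path, StandardCopyOption}
import scala.collection.mutable.ListBuffer

object LibDirSketch {
  // Copy every .jar found under depsRoot (recursively) into libDir, flattening
  // the directory structure. Returns the sorted names of the copied files.
  // Assumes libDir is not located inside depsRoot.
  def collectJars(depsRoot: Path, libDir: Path): List[String] = {
    Files.createDirectories(libDir)
    val copied = ListBuffer[String]()
    val stream = Files.walk(depsRoot)
    try {
      val it = stream.iterator()
      while (it.hasNext) {
        val p = it.next()
        val name = p.getFileName.toString
        if (Files.isRegularFile(p) && name.toLowerCase.endsWith(".jar")) {
          // Same-named jars overwrite each other (an assumption of this sketch).
          Files.copy(p, libDir.resolve(name), StandardCopyOption.REPLACE_EXISTING)
          copied += name
        }
      }
    } finally stream.close()
    copied.toList.sorted
  }
}
```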

TruX-DTF / findbugs-violations

dataset？ #5