xruben136x / SZZ_unisannio

Implementation of SZZ Algorithm
Mozilla Public License 2.0

Extracting Commit Hash, Second Number after Commit Hash, and Corresponding Authors from Blame Result in Python Using Regular Expressions #3

Closed: xruben136x closed this issue 1 year ago

xruben136x commented 1 year ago

Question: given the following output of the call blame_result = repo.git.blame(parent_commit, "third_party/xla/xla/service/gpu/gpu_copy_insertion_test.cc", "--line-porcelain"):

cf1b0378f428fedb5194083f6ba9aa708388a58d 1 1 17
author Adrian Kuegel
author-mail <akuegel@google.com>
author-time 1684162708
author-tz -0700
committer TensorFlower Gardener
committer-mail <gardener@tensorflow.org>
committer-time 1684162954
committer-tz -0700
summary Determine when fusion can share buffers between operands and outputs.
filename tensorflow/compiler/xla/service/gpu/gpu_copy_insertion_test.cc
 /* Copyright 2023 The TensorFlow Authors. All Rights Reserved.
cf1b0378f428fedb5194083f6ba9aa708388a58d 2 2
author Adrian Kuegel
author-mail <akuegel@google.com>
author-time 1684162708
author-tz -0700
committer TensorFlower Gardener
committer-mail <gardener@tensorflow.org>
committer-time 1684162954
committer-tz -0700
summary Determine when fusion can share buffers between operands and outputs.
filename tensorflow/compiler/xla/service/gpu/gpu_copy_insertion_test.cc

Answer: To extract the commit hashes, the second number after each commit hash, and the corresponding authors from the output above, you can use regular expressions in Python. Here is example code you could use:

import re

output = """
cf1b0378f428fedb5194083f6ba9aa708388a58d 1 1 17
author Adrian Kuegel
author-mail <akuegel@google.com>
author-time 1684162708
author-tz -0700
committer TensorFlower Gardener
committer-mail <gardener@tensorflow.org>
committer-time 1684162954
committer-tz -0700
summary Determine when fusion can share buffers between operands and outputs.
filename tensorflow/compiler/xla/service/gpu/gpu_copy_insertion_test.cc
cf1b0378f428fedb5194083f6ba9aa708388a58d 2 2
author Adrian Kuegel
author-mail <akuegel@google.com>
author-time 1684162708
author-tz -0700
committer TensorFlower Gardener
committer-mail <gardener@tensorflow.org>
committer-time 1684162954
committer-tz -0700
summary Determine when fusion can share buffers between operands and outputs.
filename tensorflow/compiler/xla/service/gpu/gpu_copy_insertion_test.cc
"""

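# Define the regular expression. A blame header line has the form
# "<hash> <orig-line> <final-line> [<group-size>]". The fourth field appears
# only on the first line of a group, so it must be optional; the captured
# number is the second one after the hash (the line number in the blamed file).
pattern = re.compile(r'^([0-9a-f]{40}) \d+ (\d+)(?: \d+)?\nauthor ([^\n]+)', re.MULTILINE)

# Find all matches in the output text
matches = pattern.findall(output)

# Extract the desired information
commit_hashes = [match[0] for match in matches]
second_numbers = [match[1] for match in matches]
authors = [match[2] for match in matches]

# Print the results
for commit_hash, second_number, author in zip(commit_hashes, second_numbers, authors):
    print(f"Commit Hash: {commit_hash}, Second Number: {second_number}, Author: {author}")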

This code uses regular expressions to find the commit hashes, the second number after each commit hash, and the authors in the given output text. The results are then printed for each match found.
xruben136x commented 1 year ago

This piece of code needs to be integrated in order to build the function capable of extracting candidate commits.
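
One possible shape for that integration is sketched below. It is only an illustration, not the repository's actual implementation: the function name extract_candidate_commits, its changed_lines parameter, and the returned dictionary are assumptions made here. The sample input is an abbreviated version of the blame output quoted in the question, with the tab-prefixed content line that --line-porcelain emits after each entry.

```python
import re

# Header line: "<hash> <orig-line> <final-line> [<group-size>]", followed by
# the "author" line. The group size is present only on the first line of a group.
BLAME_ENTRY = re.compile(
    r'^([0-9a-f]{40}) \d+ (\d+)(?: \d+)?\nauthor ([^\n]+)', re.MULTILINE
)


def extract_candidate_commits(blame_output, changed_lines):
    """Hypothetical helper: map commit hash -> author for every blame entry
    whose second number (line number in the blamed file) is in changed_lines."""
    wanted = set(changed_lines)
    candidates = {}
    for commit_hash, line_no, author in BLAME_ENTRY.findall(blame_output):
        if int(line_no) in wanted:
            candidates[commit_hash] = author
    return candidates


# Abbreviated sample based on the output in the question (some fields omitted).
HASH = "cf1b0378f428fedb5194083f6ba9aa708388a58d"
sample = (
    f"{HASH} 1 1 17\n"
    "author Adrian Kuegel\n"
    "author-mail <akuegel@google.com>\n"
    "filename tensorflow/compiler/xla/service/gpu/gpu_copy_insertion_test.cc\n"
    "\t/* Copyright 2023 The TensorFlow Authors. All Rights Reserved.\n"
    f"{HASH} 2 2\n"
    "author Adrian Kuegel\n"
    "author-mail <akuegel@google.com>\n"
    "filename tensorflow/compiler/xla/service/gpu/gpu_copy_insertion_test.cc\n"
    "\t\n"
)

print(extract_candidate_commits(sample, [2]))
```

Returning a dictionary keyed by commit hash collapses repeated entries for the same commit, which seems to fit the goal of collecting candidate commits; if per-line detail is needed, a list of (hash, line, author) tuples would work equally well.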