Closed caiqizh closed 6 months ago
Hey @caiqizh! We're happy to hear that you've been able to make use of our library!
Currently, the only way to change how these matrices are calculated is to edit the code in this module directly. We plan to make the implemented methods more customizable in the near future, so this should become much simpler soon.
As a temporary solution, you can use the following snippet, which uses SemanticMatrixCalculator to calculate these additional statistics. We hope to improve the code for your use case soon.
from lm_polygraph.estimators import Eccentricity
from lm_polygraph.stat_calculators import SemanticMatrixCalculator
# Put your 5-10 text samples from ChatGPT generation
samples = [
    'The capital of France is Paris.',
    'Paris is the capital city of France.',
    'The capital of France is Paris.',
    'In France, the capital is Paris.',
    'The capital city of France is Paris.',
]
stats = {
    'blackbox_sample_texts': [samples],
    'deberta_batch_size': 10,
}
nli_calculator = SemanticMatrixCalculator()
stats.update(nli_calculator(stats, None, None, None))
# Now stats should contain 'semantic_matrix_entail' and 'semantic_matrix_contra'
estimator = Eccentricity()
uncertainty = estimator(stats)[0]
print(uncertainty) # 7.886122580038351e-05
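For intuition about what the two matrices hold, here is a minimal, library-independent sketch. It assumes each matrix entry is the NLI probability for an ordered pair of samples (premise i, hypothesis j); it does not reproduce the internals of SemanticMatrixCalculator, which may differ (for example in how duplicate samples are handled). The names `toy_nli` and `pairwise_nli_matrices` are hypothetical, and `toy_nli` is only a word-overlap placeholder standing in for a DeBERTa NLI classifier.

```python
from typing import Callable, List, Tuple

# An NLI function maps (premise, hypothesis) to
# (entailment, neutral, contradiction) probabilities.
NliFn = Callable[[str, str], Tuple[float, float, float]]


def toy_nli(premise: str, hypothesis: str) -> Tuple[float, float, float]:
    """Hypothetical placeholder for a DeBERTa NLI model.

    Uses Jaccard word overlap as a fake entailment score.
    This is NOT real natural language inference.
    """
    a = set(premise.lower().split())
    b = set(hypothesis.lower().split())
    union = a | b
    overlap = len(a & b) / len(union) if union else 1.0
    entail = overlap
    contra = (1.0 - overlap) / 2
    neutral = 1.0 - entail - contra
    return entail, neutral, contra


def pairwise_nli_matrices(
    samples: List[str], nli: NliFn
) -> Tuple[List[List[float]], List[List[float]]]:
    """Build N x N entailment and contradiction matrices over all ordered pairs."""
    n = len(samples)
    entail = [[0.0] * n for _ in range(n)]
    contra = [[0.0] * n for _ in range(n)]
    for i, premise in enumerate(samples):
        for j, hypothesis in enumerate(samples):
            e, _, c = nli(premise, hypothesis)
            entail[i][j] = e
            contra[i][j] = c
    return entail, contra


samples = [
    'The capital of France is Paris.',
    'Paris is the capital city of France.',
    'The capital of France is Paris.',
]
entail, contra = pairwise_nli_matrices(samples, toy_nli)
for row in entail:
    print([round(x, 2) for x in row])
```

To get meaningful values, `toy_nli` would need to be replaced with a function that runs a real NLI model on each pair; a real model is generally not symmetric, which is why the loop visits both (i, j) and (j, i).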
Thank you for the previous answer! It looks like in the latest version this does not work anymore. Could you please reopen this issue? Thanks!
Thank you for providing the code for previously generated text! It has been very helpful, and I've successfully used it for Lexical Similarity analysis. I'm planning to try it with other measures, including NumSets, Degree Matrix (Deg), and Eccentricity.
I noticed that these measurements require two additional statistics: `semantic_matrix_entail` and `semantic_matrix_contra`. According to the original paper, I know that these are calculated using DeBERTa over generated samples. I'm wondering if there are any short code snippets available to compute these matrices and feed them into the estimator function. Thanks!