Ighina / DeepTiling

A TextTiling-based algorithm for text segmentation (aka topic segmentation) that uses neural sentence encoders, as well as extractive summarization and semantic search applications built on top of it.
MIT License

segeval error when running fit.py on wiki_test_50 with the config file parameters.json #8

Closed yasminee99 closed 1 year ago

yasminee99 commented 1 year ago

'Reference and hypothesis segmentations differ in position length ({0} is not {1}).'.format(len(reference), len(hypothesis)))
segeval.util.SegmentationMetricError: Reference and hypothesis segmentations differ in position length (31 is not 28).

The traceback points to two places. The first is deeptilingModels.py, line 328:
segeval.convert_nltk_to_masses(reference, boundary_symbol=boundary_symb)[:-1])
The second is fit.py, line 147:
Pk.append(deeptiling.compute_Pk(boundaries = results[-1]['boundaries'], ground_truth = long_true_lab[:-1], window_size=None))

Ighina commented 1 year ago

Hi,

The bug is now fixed. I had introduced a mistake in the compute_Pk method of DeepTiling; I have corrected it and re-run the programme on my side, and everything looks good now.
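
For anyone who hits the same message: it says the reference and the hypothesis do not cover the same number of positions (31 versus 28 in the report above). The sketch below is neither the DeepTiling fix nor segeval's code. It is a standalone illustration that assumes NLTK-style boundary strings such as "0100100", where "1" marks a boundary. The helper names `check_same_length` and `pk` are hypothetical. It shows an early length check and a simple windowed Pk computed only when the lengths agree.

```python
def check_same_length(reference, hypothesis):
    """Raise ValueError when the two boundary strings differ in length."""
    if not reference:
        raise ValueError("Reference segmentation is empty.")
    if len(reference) != len(hypothesis):
        raise ValueError(
            "Reference and hypothesis differ in length "
            f"({len(reference)} is not {len(hypothesis)})."
        )


def pk(reference, hypothesis, k=None, boundary="1"):
    """Windowed Pk over boundary strings (hypothetical helper, not segeval's API)."""
    check_same_length(reference, hypothesis)
    if k is None:
        # Assumption: window is roughly half the mean reference segment length.
        n_boundaries = max(reference.count(boundary), 1)
        k = max(1, round(len(reference) / (n_boundaries * 2)))
    windows = len(reference) - k + 1
    errors = 0
    for i in range(windows):
        # A window is an error when only one side sees a boundary inside it.
        ref_has = boundary in reference[i:i + k]
        hyp_has = boundary in hypothesis[i:i + k]
        errors += ref_has != hyp_has
    return errors / windows
```

Calling `check_same_length` on the two sequences just before the metric call would surface a mismatch such as 31 versus 28 with a message that points at the inputs rather than at the metric library.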