Open juneMJ opened 2 years ago
Hi @juneMJ
The coherence measures are actually defined as follows:
https://github.com/bab2min/tomotopy/blob/d30964ce0610a5e34d3645cfc8c26d99536cac03/tomotopy/coherence.py#L62-L67
The second value is the default size of the sliding window. If you don't provide the window_size argument for coherence.Coherence(), the above default values are used. To find the best window_size, you would need to run experiments that evaluate how well each coherence score with a specific window_size actually matches human evaluation. This is costly, so it is recommended to use the default values suggested in several papers.
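To make the effect of the window size concrete, here is a minimal pure-Python sketch of an NPMI-style score computed over boolean sliding windows for several window sizes. It is not tomotopy's implementation: the helper names (window_sets, npmi, npmi_coherence), the toy documents, and the handling of the edge cases p = 0 and p = 1 are all assumptions made for illustration only.

```python
import math
from itertools import combinations


def window_sets(docs, window_size):
    """Split each tokenized document into boolean sliding windows (word sets).

    A document shorter than the window is treated as a single window.
    """
    windows = []
    for doc in docs:
        if len(doc) <= window_size:
            windows.append(set(doc))
        else:
            for i in range(len(doc) - window_size + 1):
                windows.append(set(doc[i:i + window_size]))
    return windows


def npmi(p_a, p_b, p_ab):
    """Normalized PMI in [-1, 1]; edge cases are fixed by convention (assumption)."""
    if p_ab == 0:
        return -1.0  # the two words never share a window
    if p_ab == 1:
        return 1.0   # the two words share every window
    return math.log(p_ab / (p_a * p_b)) / -math.log(p_ab)


def npmi_coherence(top_words, docs, window_size):
    """Average NPMI over all pairs of top words for one window size."""
    windows = window_sets(docs, window_size)
    n = len(windows)

    def prob(*words):
        # Fraction of windows that contain every given word.
        return sum(1 for w in windows if all(x in w for x in words)) / n

    pairs = list(combinations(top_words, 2))
    return sum(npmi(prob(a), prob(b), prob(a, b)) for a, b in pairs) / len(pairs)


# Toy corpus and topic words (hypothetical data).
docs = [
    ["cat", "dog", "pet", "food", "vet", "car", "road", "fuel", "engine", "wheel"],
    ["dog", "cat", "vet", "pet", "toy", "engine", "car", "wheel", "road", "map"],
]
top_words = ["cat", "dog", "pet"]

# Compare the same topic under several window sizes.
scores = {ws: npmi_coherence(top_words, docs, ws) for ws in (2, 3, 5, 10)}
for ws, score in scores.items():
    print(f"window_size={ws}: {score:.4f}")
```

The scores differ across window sizes, but the numbers alone do not say which size is best. That still requires comparing each setting against human judgments, which is the costly part mentioned above.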
I think it is enough to use the presets ('u_mass', 'c_uci', 'c_npmi') rather than specific combinations. 'c_v' isn't recommended since it has some issues (#121, #126).
As for PAModel, there seems to be a bug in the implementation of the Coherence module. I'll look into this further.
Thank you @bab2min for the clarifications!
Hello, I'm trying several models with different coherence measures, but there are a few things I'd like to understand.
SLIDING_WINDOWS fixed? Or can I change it within a range so I can compare which size is the best?
DOCUMENT or SLIDING_WINDOWS?
Thank you very much.