Using the preloaded example sentences in the ADIDA interface, for instance:
"بدي دوب قلي قلي بجنون بحبك انا مجنون ما بنسى حبك يوم"
I get a score of 95.9% for Beirut.
When I predict the same sentence with camel_tools, I get a different result. For example, using DIDModel26, which I assume is the same model that ADIDA uses:
from camel_tools.dialectid import DIDModel26
did = DIDModel26.pretrained()
did.predict(['بدي دوب قلي قلي بجنون بحبك انا مجنون ما بنسى حبك يوم'])
I get the following scores:
[DIDPred(top='ALE', scores={'ALE': 0.2744463749182225, 'ALG': 0.0019964477414507772, 'ALX': 0.0017124356871910278, 'AMM': 0.04793813798943018, ...
Similarly, with DIDModel6 I also get different, lower scores than the online interface (although the top dialect is at least correct):
from camel_tools.dialectid import DIDModel6
did = DIDModel6.pretrained()
did.predict(['بدي دوب قلي قلي بجنون بحبك انا مجنون ما بنسى حبك يوم'])
I get the following scores:
[DIDPred(top='BEI', scores={'BEI': 0.5475092868164938, 'CAI': 0.05423997031019218, 'DOH': 0.018378809169102468, 'MSA': 0.003793013408907513, 'RAB': 0.0018751946461352397, 'TUN': 0.37420372564916876})]
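To make the comparison with the percentage shown in ADIDA easier, here is a minimal sketch that ranks a scores dictionary and converts it to percentages. The helper name `top_k` is my own and not part of camel_tools; the values are copied from the DIDModel6 output above, and the same function could presumably be applied to the `scores` field of any `DIDPred`.

```python
def top_k(scores, k=3):
    """Return the k highest-scoring labels as (label, percent) pairs."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(label, round(100 * score, 1)) for label, score in ranked[:k]]


# Scores copied verbatim from the DIDModel6 prediction reported above.
model6_scores = {
    'BEI': 0.5475092868164938,
    'CAI': 0.05423997031019218,
    'DOH': 0.018378809169102468,
    'MSA': 0.003793013408907513,
    'RAB': 0.0018751946461352397,
    'TUN': 0.37420372564916876,
}

for label, pct in top_k(model6_scores):
    print(f"{label}: {pct}%")
```

With these values the top entry is BEI at 54.8%, followed by TUN at 37.4%, compared with the 95.9% for Beirut reported by the online interface.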
Environment: camel_tools 1.5.2 on macOS 14.1.1