clulab / processors

Natural Language Processors
https://clulab.github.io/processors/

Adverse effect of processors v9 on eidos #783

Open kwalcock opened 5 months ago

kwalcock commented 5 months ago

I'm not sure where (or if) these were recorded before. I'll try to get to the bottom of them here.

[info] *** 226 TESTS FAILED ***
[error] Failed tests:
[error]         org.clulab.wm.eidos.text.english.raps.TestRaps
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc5
[error]         org.clulab.wm.eidos.serialization.jsonld.TestJLDSerializer
[error]         org.clulab.wm.eidos.text.english.raps.TestRaps1
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc8
[error]         org.clulab.wm.eidos.text.englishGrounding.TestGrounding
[error]         org.clulab.wm.eidos.text.english.cag.TestCagP1
[error]         org.clulab.wm.eidos.text.english.cag.TestExtraText
[error]         org.clulab.wm.eidos.serialization.TestDocSerialization
[error]         org.clulab.wm.eidos.text.english.cag.TestCagP0
[error]         org.clulab.wm.eidos.text.englishGrounding.TestSpecificGroundings
[error]         org.clulab.wm.eidos.utils.TestLauncher
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc2
[error]         org.clulab.wm.eidos.text.english.cag.TestCagP4
[error]         org.clulab.wm.eidos.system.TestCrLf
[error]         org.clulab.wm.eidos.serialization.jsonld.TestJLDDeserializer
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc3
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc6
[error]         org.clulab.wm.eidos.text.english.cag.TestCagP3
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc1
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc4
[error]         org.clulab.wm.eidos.text.englishGrounding.TestGrounderStability
[error]         org.clulab.wm.eidos.system.TestEidosMention
[error]         org.clulab.wm.eidos.text.english.cag.TestCagP2
[error]         org.clulab.wm.eidos.document.TestSentenceClassifier
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc7
[error] (Test / test) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 2235 s (37:15), completed Feb 13, 2024 10:09:21 AM
sbt:eidos>

MihaiSurdeanu commented 5 months ago

Thank you @kwalcock !!

kwalcock commented 4 months ago

@MihaiSurdeanu, TestJLDSerializer is failing because one date does not get turned into an attachment. This seems to be because an entity in a sentence is expected to be DATE in eidos and it was so using the old version of processors, but it is B-DATE in the new version. Does this ring any bells?

MihaiSurdeanu commented 4 months ago

Ah, I see. This happens because we use the BIO notation for named and numeric entities, whereas CoreNLP does not. This is a small change that does not matter, so I think we should adjust the unit tests!

kwalcock commented 4 months ago

It doesn't matter much, but the particular tests would be difficult to change. Instead, for now I've converted B-DATE and I-DATE to DATE, and the errors in two unit tests went away.
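
A minimal sketch of that kind of conversion, assuming the entity labels are available as a plain array of strings. `BioLabels`, `collapsible`, `collapse`, and `collapseAll` are hypothetical names used only for illustration. They are not part of processors or eidos, and the actual patch may look different.

```scala
// Hypothetical helper, not taken from processors or eidos.
// It strips the BIO prefix from selected labels so that code expecting
// CoreNLP-style labels (DATE rather than B-DATE or I-DATE) still matches.
object BioLabels {
  // Labels whose B-/I- prefixes get collapsed. DATE is the only one
  // mentioned in this thread; anything else is left untouched.
  val collapsible: Set[String] = Set("DATE")

  def collapse(label: String): String = {
    if (label.startsWith("B-") || label.startsWith("I-")) {
      val bare = label.substring(2)
      if (collapsible.contains(bare)) bare else label
    }
    else label
  }

  def collapseAll(labels: Array[String]): Array[String] = labels.map(collapse)
}
```

With this sketch, `BioLabels.collapseAll(Array("O", "B-DATE", "I-DATE", "B-PER"))` gives `O, DATE, DATE, B-PER`. Restricting the conversion to an explicit set keeps other BIO labels intact for any code that already handles them.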

The next problem is that eidos is seeing empty strings for norms where earlier it had seen O. I'm patching that up as well. Is it an expected change?
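
The norm patch could be sketched along the same lines, again assuming a plain array of strings. `NormPatch`, `outside`, and `patch` are hypothetical names for illustration only, not code from either project.

```scala
// Hypothetical helper, not taken from processors or eidos.
// It restores the "O" placeholder that eidos saw with the old version of
// processors wherever the new version supplies an empty string.
object NormPatch {
  val outside = "O"

  def patch(norms: Array[String]): Array[String] =
    norms.map(norm => if (norm.isEmpty) outside else norm)
}
```

Non-empty norms pass through unchanged, so `NormPatch.patch(Array("", "2024-02-13", ""))` gives `O, 2024-02-13, O` (the date value is made up for the example).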

MihaiSurdeanu commented 4 months ago

No, that's another instance of me forgetting what I did before :)

I'm now thinking perhaps it's simpler to patch things up directly in processors, to: