clulab / reach

Reach Biomedical Information Extraction
Other
97 stars 39 forks source link

Mentions should be serializable #108

Closed hickst closed 8 years ago

hickst commented 8 years ago

Can't serialize mentions unless KBResolution is serializable:

serializing ... error java.io.NotSerializableException: edu.arizona.sista.reach.grounding.KBResolution java.io.NotSerializableException: edu.arizona.sista.reach.grounding.KBResolution at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184) at java.io.ObjectOutputStream.writeArray(ObjectOutputStream.java:1378) at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1174) at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548) ...

myedibleenso commented 8 years ago

Adding extends Serializable is enough for that, but it seems a with Serializable needs to be added to KBEntry as well in order for Bio/CorefMentions to become serializable again.

hickst commented 8 years ago

GHP says: I haven’t run it yet, but I think this (in the console) will give us the full error:

import edu.arizona.sista.reach.ReachSystem
import edu.arizona.sista.utils.Serializer
import edu.arizona.sista.odin._

val phosText = "Mek was not phosphorylated."

val rs = new ReachSystem()
val mentions = rs.extractFrom(phosText, "serialization-test", "1")
val dataset = Serializer.save[Seq[Mention]](mentions, "poop.ser")
myedibleenso commented 8 years ago

I ran it and there is no error, provided you've added with Serializable or extends Serializable to those two grounding classes.

myedibleenso commented 8 years ago

Maybe we can adapt that to a test:

import edu.arizona.sista.reach.ReachSystem
import edu.arizona.sista.utils.Serializer
import edu.arizona.sista.arizona.odin._

"Serializer" should "be able to serialize a Seq[Mention]" in {
  val phosText = "Mek was not phosphorylated."

  val mentions = testReach.extractFrom(phosText, "serialization-test", "1")
  // create temp file
  val tempFile = java.io.File.createTempFile("mentions", ".ser")
  // serialize mentions
  val dataset = Serializer.save[Seq[Mention]](mentions, tempFile.getPath)
  tempFile.exists should be(true)
}

it should "be able to load a serialized Seq[Mention]" in {
  ...
}
marcovzla commented 8 years ago

Serializer.save() return type is Unit

myedibleenso commented 8 years ago

oops! I kept that from the load stuff. I'll fix it.

myedibleenso commented 8 years ago

Revised version:

import edu.arizona.sista.reach.ReachSystem
import edu.arizona.sista.utils.Serializer
import edu.arizona.sista.arizona.odin._

"Serializer" should "be able to serialize a Seq[Mention]" in {
  val phosText = "Mek was not phosphorylated."

  val mentions = testReach.extractFrom(phosText, "serialization-test", "1")
  // create temp file
  val tempFile = java.io.File.createTempFile("mentions", ".ser")
  // serialize mentions
  Serializer.save[Seq[Mention]](mentions, tempFile.getPath)
  tempFile.exists should be(true)
}

it should "be able to load a serialized Seq[Mention]" in {
  ...
}
hickst commented 8 years ago

I'm using very similar code but I'm still getting this error:

[info] TestMentionSerialization:
[info] Mouse AKT1 phosphorylates PTHR2 in chicken adenoids.
[info] - should produce 7 mentions (13 milliseconds)
[info] Serializer
[info] - should write serialized mentions *** FAILED *** (67 milliseconds)
[info]   java.io.NotSerializableException: scala.collection.immutable.MapLike$$anon$2
[info]   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
[info]   at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
...
hickst commented 8 years ago

This seems to be a problem serializing any Mention which has context attached. I was able to create a test which succeeds on a phrase without context and fails on a phrase with context. See the test file 'TestMentionSerialization.scala' in branch 'issue108'.

I am reassigning this issue to Enrique.

myedibleenso commented 8 years ago

@hickst, if you already have a test, it might be useful to make your changes to KBResolution and KBEntry, add the test, and push that branch. Then @enoriega could just make the needed change to his stuff on that branch and he would know when/if adding extends Serializable to his stuff fixes the issue.

hickst commented 8 years ago

@myedibleenso You're tired dude....that's exactly what I described in the comment previous to yours. As I said in that comment, the branch is called 'issue108'.

myedibleenso commented 8 years ago

Ha! Sorry about that, @hickst. Yeah, I must've been really tired...

enoriega commented 8 years ago

Fixed. See pull request #133