clulab / eidos

Machine reading system for World Modelers
Apache License 2.0

Lightweight eidos #891

Open BeckySharp opened 4 years ago

BeckySharp commented 4 years ago

Is it possible to move some stuff to subprojects or something, so we can have a lightweight eidos that can run without taking > 10 mins to package a jar?

MihaiSurdeanu commented 4 years ago

Not sure... If you're talking about building the fat jar, then this won't matter, because the fat jar includes all the dependencies.

kwalcock commented 4 years ago

Yes, what Mihai said. The difficulty is mainly with the glove file in my experience. We have not yet switched to the smaller version. That will help eventually. One can cheat on a local build by using the cached vectors and then removing the dependency from build.sbt.

MihaiSurdeanu commented 4 years ago

Should we just switch to the small version?


kwalcock commented 4 years ago

A plain run, though, should not take that long. How is it that you are running it and what part is it that is running?

BeckySharp commented 4 years ago

I made a file with one sentence, did sbt run, and then selected 11 (extract and export). By the way, it ran out of memory with 8g, so I have to start all over.


kwalcock commented 4 years ago

I will try it, just not really soon. You might consider using IntelliJ for that, though. It leaves the dependencies in their own jars and manipulates the classpath to refer to them. sbt seems to put everything into a single jar, requiring lots of time, and then call that directly.

BeckySharp commented 4 years ago

Thanks, it's not urgent. I tried in IntelliJ first and it was hanging too, so I probably need to rebuild it there as well. I was just feeling frustrated by IntelliJ continually seeming to need clean compiles; I guess I gave up too soon.

MihaiSurdeanu commented 4 years ago

Agreed. For smaller jobs, IntelliJ is better.


kwalcock commented 4 years ago

IntelliJ notices fairly well when the main build.sbt changes, but poorly if something like webapp/build.sbt changes. That's been a problem for me that can necessitate a new, nonincremental build. Sometimes I forget to add something like -Xmx8g to the VM options as well.

I was hoping to switch to the smaller vector file when the geolocations are moved to Metal and there is another vector file already included (if I remember correctly).

BeckySharp commented 4 years ago

Well, if someone can generate JSON-LD for this sentence, I'd be indebted:

Marginal improvements in levels of acute malnutrition are expected due to consumption of household production.


BeckySharp commented 4 years ago

It's for the hackathon at 10.


kwalcock commented 4 years ago

It's running. Had the cache off. Hope hackathon is at 11.

kwalcock commented 4 years ago

I don't see that it found anything.

{
  "@context" : {
    "Corpus" : "https://w3id.org/wm/cag/corpus",
    "Dependency" : "https://w3id.org/wm/cag/dependency",
    "Document" : "https://w3id.org/wm/cag/document",
    "Sentence" : "https://w3id.org/wm/cag/sentence",
    "Word" : "https://w3id.org/wm/cag/word"
  },
  "@type" : "Corpus",
  "documents" : [ {
    "@type" : "Document",
    "@id" : "_:Document_1",
    "text" : "Marginal improvements in levels of acute malnutrition are expected due to\nconsumption of household production.",
    "sentences" : [ {
      "@type" : "Sentence",
      "@id" : "_:Sentence_1",
      "text" : "Marginal improvements in levels of acute malnutrition are expected due to consumption of household production .",
      "words" : [ {
        "@type" : "Word",
        "@id" : "_:Word_1",
        "text" : "Marginal",
        "tag" : "JJ",
        "entity" : "B-Quantifier",
        "startOffset" : 0,
        "endOffset" : 8,
        "lemma" : "marginal",
        "chunk" : "B-NP",
        "norm" : "O"
      }, {
        "@type" : "Word",
        "@id" : "_:Word_2",
        "text" : "improvements",
        "tag" : "NNS",
        "entity" : "O",
        "startOffset" : 9,
        "endOffset" : 21,
        "lemma" : "improvement",
        "chunk" : "I-NP",
        "norm" : "O"
      }, {
        "@type" : "Word",
        "@id" : "_:Word_3",
        "text" : "in",
        "tag" : "IN",
        "entity" : "O",
        "startOffset" : 22,
        "endOffset" : 24,
        "lemma" : "in",
        "chunk" : "B-PP",
        "norm" : "O"
      }, {
        "@type" : "Word",
        "@id" : "_:Word_4",
        "text" : "levels",
        "tag" : "NNS",
        "entity" : "O",
        "startOffset" : 25,
        "endOffset" : 31,
        "lemma" : "level",
        "chunk" : "B-NP",
        "norm" : "O"
      }, {
        "@type" : "Word",
        "@id" : "_:Word_5",
        "text" : "of",
        "tag" : "IN",
        "entity" : "O",
        "startOffset" : 32,
        "endOffset" : 34,
        "lemma" : "of",
        "chunk" : "B-PP",
        "norm" : "O"
      }, {
        "@type" : "Word",
        "@id" : "_:Word_6",
        "text" : "acute",
        "tag" : "JJ",
        "entity" : "O",
        "startOffset" : 35,
        "endOffset" : 40,
        "lemma" : "acute",
        "chunk" : "B-NP",
        "norm" : "O"
      }, {
        "@type" : "Word",
        "@id" : "_:Word_7",
        "text" : "malnutrition",
        "tag" : "NN",
        "entity" : "CAUSE_OF_DEATH",
        "startOffset" : 41,
        "endOffset" : 53,
        "lemma" : "malnutrition",
        "chunk" : "I-NP",
        "norm" : "O"
      }, {
        "@type" : "Word",
        "@id" : "_:Word_8",
        "text" : "are",
        "tag" : "VBP",
        "entity" : "O",
        "startOffset" : 54,
        "endOffset" : 57,
        "lemma" : "be",
        "chunk" : "B-VP",
        "norm" : "O"
      }, {
        "@type" : "Word",
        "@id" : "_:Word_9",
        "text" : "expected",
        "tag" : "VBN",
        "entity" : "O",
        "startOffset" : 58,
        "endOffset" : 66,
        "lemma" : "expect",
        "chunk" : "I-VP",
        "norm" : "O"
      }, {
        "@type" : "Word",
        "@id" : "_:Word_10",
        "text" : "due",
        "tag" : "JJ",
        "entity" : "O",
        "startOffset" : 67,
        "endOffset" : 70,
        "lemma" : "due",
        "chunk" : "B-ADJP",
        "norm" : "O"
      }, {
        "@type" : "Word",
        "@id" : "_:Word_11",
        "text" : "to",
        "tag" : "TO",
        "entity" : "O",
        "startOffset" : 71,
        "endOffset" : 73,
        "lemma" : "to",
        "chunk" : "B-PP",
        "norm" : "O"
      }, {
        "@type" : "Word",
        "@id" : "_:Word_12",
        "text" : "consumption",
        "tag" : "NN",
        "entity" : "O",
        "startOffset" : 74,
        "endOffset" : 85,
        "lemma" : "consumption",
        "chunk" : "B-NP",
        "norm" : "O"
      }, {
        "@type" : "Word",
        "@id" : "_:Word_13",
        "text" : "of",
        "tag" : "IN",
        "entity" : "O",
        "startOffset" : 86,
        "endOffset" : 88,
        "lemma" : "of",
        "chunk" : "B-PP",
        "norm" : "O"
      }, {
        "@type" : "Word",
        "@id" : "_:Word_14",
        "text" : "household",
        "tag" : "NN",
        "entity" : "O",
        "startOffset" : 89,
        "endOffset" : 98,
        "lemma" : "household",
        "chunk" : "B-NP",
        "norm" : "O"
      }, {
        "@type" : "Word",
        "@id" : "_:Word_15",
        "text" : "production",
        "tag" : "NN",
        "entity" : "O",
        "startOffset" : 99,
        "endOffset" : 109,
        "lemma" : "production",
        "chunk" : "I-NP",
        "norm" : "O"
      }, {
        "@type" : "Word",
        "@id" : "_:Word_16",
        "text" : ".",
        "tag" : ".",
        "entity" : "O",
        "startOffset" : 109,
        "endOffset" : 110,
        "lemma" : ".",
        "chunk" : "O",
        "norm" : "O"
      } ],
      "dependencies" : [ {
        "@type" : "Dependency",
        "source" : {
          "@id" : "_:Word_7"
        },
        "destination" : {
          "@id" : "_:Word_5"
        },
        "relation" : "case"
      }, {
        "@type" : "Dependency",
        "source" : {
          "@id" : "_:Word_12"
        },
        "destination" : {
          "@id" : "_:Word_10"
        },
        "relation" : "case"
      }, {
        "@type" : "Dependency",
        "source" : {
          "@id" : "_:Word_10"
        },
        "destination" : {
          "@id" : "_:Word_11"
        },
        "relation" : "mwe"
      }, {
        "@type" : "Dependency",
        "source" : {
          "@id" : "_:Word_2"
        },
        "destination" : {
          "@id" : "_:Word_1"
        },
        "relation" : "amod"
      }, {
        "@type" : "Dependency",
        "source" : {
          "@id" : "_:Word_12"
        },
        "destination" : {
          "@id" : "_:Word_15"
        },
        "relation" : "nmod_of"
      }, {
        "@type" : "Dependency",
        "source" : {
          "@id" : "_:Word_4"
        },
        "destination" : {
          "@id" : "_:Word_3"
        },
        "relation" : "case"
      }, {
        "@type" : "Dependency",
        "source" : {
          "@id" : "_:Word_4"
        },
        "destination" : {
          "@id" : "_:Word_7"
        },
        "relation" : "nmod_of"
      }, {
        "@type" : "Dependency",
        "source" : {
          "@id" : "_:Word_9"
        },
        "destination" : {
          "@id" : "_:Word_12"
        },
        "relation" : "nmod_due_to"
      }, {
        "@type" : "Dependency",
        "source" : {
          "@id" : "_:Word_9"
        },
        "destination" : {
          "@id" : "_:Word_16"
        },
        "relation" : "punct"
      }, {
        "@type" : "Dependency",
        "source" : {
          "@id" : "_:Word_2"
        },
        "destination" : {
          "@id" : "_:Word_4"
        },
        "relation" : "nmod_in"
      }, {
        "@type" : "Dependency",
        "source" : {
          "@id" : "_:Word_7"
        },
        "destination" : {
          "@id" : "_:Word_6"
        },
        "relation" : "amod"
      }, {
        "@type" : "Dependency",
        "source" : {
          "@id" : "_:Word_9"
        },
        "destination" : {
          "@id" : "_:Word_2"
        },
        "relation" : "nsubjpass"
      }, {
        "@type" : "Dependency",
        "source" : {
          "@id" : "_:Word_15"
        },
        "destination" : {
          "@id" : "_:Word_13"
        },
        "relation" : "case"
      }, {
        "@type" : "Dependency",
        "source" : {
          "@id" : "_:Word_15"
        },
        "destination" : {
          "@id" : "_:Word_14"
        },
        "relation" : "compound"
      }, {
        "@type" : "Dependency",
        "source" : {
          "@id" : "_:Word_9"
        },
        "destination" : {
          "@id" : "_:Word_8"
        },
        "relation" : "auxpass"
      } ]
    } ]
  } ]
}
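
A quick way to confirm that an output like the one above holds only the parse and no extractions is to count what the file contains. This is a minimal sketch, not part of Eidos: the helper name summarize_jsonld is hypothetical, the sample is an abbreviated stand-in for the real output, and it assumes that extractions, when present, appear in a top-level "extractions" list.

```python
import json


def summarize_jsonld(corpus: dict) -> dict:
    """Count the main pieces of a parsed Eidos-style JSON-LD corpus."""
    documents = corpus.get("documents", [])
    sentences = [s for d in documents for s in d.get("sentences", [])]
    return {
        "documents": len(documents),
        "sentences": len(sentences),
        "words": sum(len(s.get("words", [])) for s in sentences),
        "dependencies": sum(len(s.get("dependencies", [])) for s in sentences),
        # Assumption: extractions, if any, sit in a top-level "extractions" list.
        "extractions": len(corpus.get("extractions", [])),
    }


# Abbreviated stand-in for the output above: one sentence, no extractions.
sample = json.loads("""
{
  "@type": "Corpus",
  "documents": [ {
    "@type": "Document",
    "sentences": [ {
      "@type": "Sentence",
      "words": [ { "@type": "Word", "text": "Marginal" },
                 { "@type": "Word", "text": "improvements" } ],
      "dependencies": [ { "@type": "Dependency", "relation": "amod" } ]
    } ]
  } ]
}
""")

summary = summarize_jsonld(sample)
print(summary)
```

For a real run, the sample would be replaced by json.load on the exported file; an "extractions" count of 0 next to nonzero word and dependency counts would match the "parsed, but nothing found" situation described here.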

kwalcock commented 4 years ago

The webapp found things. I'll change branches and check again.

BeckySharp commented 4 years ago

Pinging on Slack.
