Wordseer / wordseer

The WordSeer text analysis tool, written in Flask.
http://wordseer.berkeley.edu/

All properties have same value after processing structure file #123

Closed jannah closed 10 years ago

jannah commented 10 years ago

After manually fixing the above, I managed to run it and have the document structure with all the sentences and everything come out properly. However, the properties are messed up; for each tweet, all of its properties have the same value, which seems to be this:

value = unicode(etree.tostring(node.getparent(), encoding="utf-8", method="text")).strip()

from structureextractor.py line 280. I'm not exactly sure what's causing this.
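
For illustration only, here is a minimal sketch of what that expression evaluates to. The tweet document and the helper names `parent_text` and `own_text` are assumptions, and the code is Python 3 with lxml rather than the project's Python 2 code. Because the expression serializes `node.getparent()` instead of `node`, sibling nodes all produce the same string, which would match the symptom described, though it may not be the actual cause here:

```python
from lxml import etree

# Hypothetical tweet document; element names are illustrative only.
doc = etree.fromstring(
    "<tweet><user>alice</user><lang>en</lang><text>hello</text></tweet>"
)

def parent_text(node):
    # Python 3 equivalent of the quoted expression:
    # serialize the text content of the node's *parent*, then strip it.
    return etree.tostring(
        node.getparent(), encoding="unicode", method="text"
    ).strip()

def own_text(node):
    # Serialize the node itself; with_tail=False drops any text that
    # follows the node's closing tag.
    return etree.tostring(
        node, encoding="unicode", method="text", with_tail=False
    ).strip()

parent_values = [parent_text(n) for n in doc]  # identical for all siblings
own_values = [own_text(n) for n in doc]        # one distinct value per node
```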

keien commented 10 years ago

@jannah Is your updated branch ready to be merged in?

jannah commented 10 years ago

It's merged, but I couldn't find this issue. Can you verify which attribute is being used for the value?

Regards, Hassan Jannah On Aug 2, 2014 12:19 PM, "Keien Ohta" notifications@github.com wrote:

@jannah https://github.com/jannah Is your updated branch ready to be merged in?

— Reply to this email directly or view it on GitHub https://github.com/Wordseer/wordseer_flask/issues/123#issuecomment-50971764 .

keien commented 10 years ago

It seems that it's only an issue with the Shakespeare set; I'll look more into it later. I was just asking because I wanted to test the aaron document with your tagger.

jannah commented 10 years ago

Start with a fresh structure file. It might be fixed after the latest changes.

Regards, Hassan Jannah

keien commented 10 years ago

Is the Elastic Beanstalk deployment updated with the bug fixes? I just created a structure file and the xpaths were still not relative.

jannah commented 10 years ago

It is updated.

Xpaths are still absolute. Aditi said it would work. That is better than relative, especially in complex documents. I just removed the ending / as you asked.

Regards, Hassan Jannah

keien commented 10 years ago

@PlasmaSheep this means we need to change StructureExtractor.get_nodes_from_xpath: right now, when xpath is absolute, nodes.xpath(xpath) returns every node in the document that matches the xpath, not just the ones that are children of the node.
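
A minimal sketch of that behaviour with lxml, using a made-up play document (the element names and the variable `first_act` are assumptions, not taken from the project's data):

```python
from lxml import etree

# Hypothetical Shakespeare-like document; element names are assumptions.
play = etree.fromstring(
    "<PLAY><TITLE>Hamlet</TITLE>"
    "<ACT><TITLE>Act I</TITLE></ACT>"
    "<ACT><TITLE>Act II</TITLE></ACT></PLAY>"
)
first_act = play.find("ACT")

# Absolute path: evaluated from the document root, so the context node
# (first_act) is ignored and titles from every ACT come back.
absolute = [t.text for t in first_act.xpath("/PLAY/ACT/TITLE")]

# Relative path: evaluated from first_act, so only its own TITLE matches.
relative = [t.text for t in first_act.xpath("./TITLE")]
```

Here `absolute` contains the titles of both acts, while `relative` contains only the title of the first act.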

jannah commented 10 years ago

Or you can try to use the following:

/descendants::

It might be children, not descendants. Give it a try.

Regards, Hassan Jannah

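
For reference, the XPath 1.0 axis names are singular: `descendant::` and `child::`. A small sketch with lxml on a made-up document (element names are assumptions) shows how the two differ when evaluated from a given node:

```python
from lxml import etree

# Hypothetical document with a TITLE at two depths below ACT.
play = etree.fromstring(
    "<PLAY><TITLE>Hamlet</TITLE>"
    "<ACT><TITLE>Act I</TITLE><SCENE><TITLE>Scene 1</TITLE></SCENE></ACT>"
    "</PLAY>"
)
act = play.find("ACT")

# child:: matches only direct children of the context node.
children = [t.text for t in act.xpath("child::TITLE")]

# descendant:: matches matching nodes at any depth below the context node.
descendants = [t.text for t in act.xpath("descendant::TITLE")]
```

Both axes are relative to the context node, so neither picks up the play-level TITLE.
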
abendebury commented 10 years ago

Why do we need to use absolute xpaths? The original implementation used relative xpaths.

keien commented 10 years ago

@jannah @silverasm

jannah commented 10 years ago

I was out all day. We can talk about it in the meeting tomorrow and decide which is better.

Regards, Hassan M. Jannah

jannah commented 10 years ago

I added two new features:

  1. You can now preview the structure file before saving it.
  2. I added a relative xpath to the xpaths array: [absolute, relative]

"xpaths": [ "/PLAY/TITLE", "./TITLE" ],

The changes are live at the same URL.
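
One way a consumer of the structure file could use such an array is sketched below. The helper `select`, the sample document, and the rule "use only entries starting with `.`" are all assumptions for illustration, not the project's implementation:

```python
import json
from lxml import etree

# Fragment shaped like the "xpaths" line quoted above; the wrapping braces
# are added here only so that it parses as JSON.
entry = json.loads('{"xpaths": ["/PLAY/TITLE", "./TITLE"]}')

# Hypothetical document; element names are illustrative.
play = etree.fromstring(
    "<PLAY><TITLE>Hamlet</TITLE><ACT><TITLE>Act I</TITLE></ACT></PLAY>"
)

def select(context, xpaths):
    # Hypothetical helper: evaluate only the relative xpaths (those that
    # start with "."), so results stay scoped to the context node.
    results = []
    for xpath in xpaths:
        if xpath.startswith("."):
            results.extend(context.xpath(xpath))
    return results

titles = [t.text for t in select(play, entry["xpaths"])]
```

With the play element as context, only its direct TITLE child is selected.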

Regards, Hassan M. Jannah

abendebury commented 10 years ago

I think we've resolved this; I'll reopen if it's still an issue.