nlplab / brat

brat rapid annotation tool (brat) - for all your textual annotation needs
http://brat.nlplab.org

drawing an arc too slow #974

Open seyyaw opened 11 years ago

seyyaw commented 11 years ago

I have some 100 sentences annotated with POS. Drawing an arc from source to target takes too much time. It seems that it first calculates all possible targets (in this case, all the POS-annotated tokens are possible) and gets the arc labels from all of them (even though they are all the same).

My json response for getCollectionInformation looks like this one:

"arcs":[
{"color":"green","arrowHead":"triangle,5","labels":["$"],"hotkey":"d","type":"$","targets":["$(","$,","$.","ADJA","ADJD","ADV","APPO","APPR","APPRART","APZR","ART","CARD","FM","ITJ","KOKOM","KON","KOUI","KOUS","NE","NN","PAV","PDAT","PDS","PIAT","PIDAT","PIS","PPER","PPOSAT","PPOSS","PRELAT","PRELS","PRF","PROAV","PTKA","PTKANT","PTKNEG","PTKVZ","PTKZU","PWAT","PWAV","PWS","TRUNC","VAFIN","VAIMP","VAINF","VAPP","VMFIN","VMINF","VMPP","VVFIN","VVIMP","VVINF","VVIZU","VVPP","XY"],"dashArray":""},
{"color":"green","arrowHead":"triangle,5","labels":["--"],"hotkey":"d","type":"--","targets":["$(","$,","$.","ADJA","ADJD","ADV","APPO","APPR","APPRART","APZR","ART","CARD","FM","ITJ","KOKOM","KON","KOUI","KOUS","NE","NN","PAV","PDAT","PDS","PIAT","PIDAT","PIS","PPER","PPOSAT","PPOSS","PRELAT","PRELS","PRF","PROAV","PTKA","PTKANT","PTKNEG","PTKVZ","PTKZU","PWAT","PWAV","PWS","TRUNC","VAFIN","VAIMP","VAINF","VAPP","VMFIN","VMINF","VMPP","VVFIN","VVIMP","VVINF","VVIZU","VVPP","XY"],"dashArray":""},
{"color":"green","arrowHead":"triangle,5","labels":["-PUNCT-"],"hotkey":"d","type":"-PUNCT-","targets":["$(","$,","$.","ADJA","ADJD","ADV","APPO","APPR","APPRART","APZR","ART","CARD","FM","ITJ","KOKOM","KON","KOUI","KOUS","NE","NN","PAV","PDAT","PDS","PIAT","PIDAT","PIS","PPER","PPOSAT","PPOSS","PRELAT","PRELS","PRF","PROAV","PTKA","PTKANT","PTKNEG","PTKVZ","PTKZU","PWAT","PWAV","PWS","TRUNC","VAFIN","VAIMP","VAINF","VAPP","VMFIN","VMINF","VMPP","VVFIN","VVIMP","VVINF","VVIZU","VVPP","XY"],"dashArray":""},
{"color":"green","arrowHead":"triangle,5","labels":["AC"],"hotkey":"d","type":"AC","targets":["$(","$,","$.","ADJA","ADJD","ADV","APPO","APPR","APPRART","APZR","ART","CARD","FM","ITJ","KOKOM","KON","KOUI","KOUS","NE","NN","PAV","PDAT","PDS","PIAT","PIDAT","PIS","PPER","PPOSAT","PPOSS","PRELAT","PRELS","PRF","PROAV","PTKA","PTKANT","PTKNEG","PTKVZ","PTKZU","PWAT","PWAV","PWS","TRUNC","VAFIN","VAIMP","VAINF","VAPP","VMFIN","VMINF","VMPP","VVFIN","VVIMP","VVINF","VVIZU","VVPP","XY"],"dashArray":""},
{"color":"green","arrowHead":"triangle,5","labels":["ADV"],"hotkey":"d","type":"ADV","targets":["$(","$,","$.","ADJA","ADJD","ADV","APPO","APPR","APPRART","APZR","ART","CARD","FM","ITJ","KOKOM","KON","KOUI","KOUS","NE","NN","PAV","PDAT","PDS","PIAT","PIDAT","PIS","PPER","PPOSAT","PPOSS","PRELAT","PRELS","PRF","PROAV","PTKA","PTKANT","PTKNEG","PTKVZ","PTKZU","PWAT","PWAV","PWS","TRUNC","VAFIN","VAIMP","VAINF","VAPP","VMFIN","VMINF","VMPP","VVFIN","VVIMP","VVINF","VVIZU","VVPP","XY"],"dashArray":""},
{"color":"green","arrowHead":"triangle,5","labels":["AG"],"hotkey":"d","type":"AG","targets":["$(","$,","$.","ADJA","ADJD","ADV","APPO","APPR","APPRART","APZR","ART","CARD","FM","ITJ","KOKOM","KON","KOUI","KOUS","NE","NN","PAV","PDAT","PDS","PIAT","PIDAT","PIS","PPER","PPOSAT","PPOSS","PRELAT","PRELS","PRF","PROAV","PTKA","PTKANT","PTKNEG","PTKVZ","PTKZU","PWAT","PWAV","PWS","TRUNC","VAFIN","VAIMP","VAINF","VAPP","VMFIN","VMINF","VMPP","VVFIN","VVIMP","VVINF","VVIZU","VVPP","XY"],"dashArray":""},
{"color":"green","arrowHead":"triangle,5","labels":["APP"],"hotkey":"d","type":"APP","targets":["$(","$,","$.","ADJA","ADJD","ADV","APPO","APPR","APPRART","APZR","ART","CARD","FM","ITJ","KOKOM","KON","KOUI","KOUS","NE","NN","PAV","PDAT","PDS","PIAT","PIDAT","PIS","PPER","PPOSAT","PPOSS","PRELAT","PRELS","PRF","PROAV","PTKA","PTKANT","PTKNEG","PTKVZ","PTKZU","PWAT","PWAV","PWS","TRUNC","VAFIN","VAIMP","VAINF","VAPP","VMFIN","VMINF","VMPP","VVFIN","VVIMP","VVINF","VVIZU","VVPP","XY"],"dashArray":""},
{"color":"green","arrowHead":"triangle,5","labels":["ATTR"],"hotkey":"d","type":"ATTR","targets":["$(","$,","$.","ADJA","ADJD","ADV","APPO","APPR","APPRART","APZR","ART","CARD","FM","ITJ","KOKOM","KON","KOUI","KOUS","NE","NN","PAV","PDAT","PDS","PIAT","PIDAT","PIS","PPER","PPOSAT","PPOSS","PRELAT","PRELS","PRF","PROAV","PTKA","PTKANT","PTKNEG","PTKVZ","PTKZU","PWAT","PWAV","PWS","TRUNC","VAFIN","VAIMP","VAINF","VAPP","VMFIN","VMINF","VMPP","VVFIN","VVIMP","VVINF","VVIZU","VVPP","XY"],"dashArray":""},
{"color":"green","arrowHead":"triangle,5","labels":["AUX"],"hotkey":"d","type":"AUX","targets":["$(","$,","$.","ADJA","ADJD","ADV","APPO","APPR","APPRART","APZR","ART","CARD","FM","ITJ","KOKOM","KON","KOUI","KOUS","NE","NN","PAV","PDAT","PDS","PIAT","PIDAT","PIS","PPER","PPOSAT","PPOSS","PRELAT","PRELS","PRF","PROAV","PTKA","PTKANT","PTKNEG","PTKVZ","PTKZU","PWAT","PWAV","PWS","TRUNC","VAFIN","VAIMP","VAINF","VAPP","VMFIN","VMINF","VMPP","VVFIN","VVIMP","VVINF","VVIZU","VVPP","XY"],"dashArray":""},
{"color":"green","arrowHead":"triangle,5","labels":["AVZ"],"hotkey":"d","type":"AVZ","targets":["$(","$,","$.","ADJA","ADJD","ADV","APPO","APPR","APPRART","APZR","ART","CARD","FM","ITJ","KOKOM","KON","KOUI","KOUS","NE","NN","PAV","PDAT","PDS","PIAT","PIDAT","PIS","PPER","PPOSAT","PPOSS","PRELAT","PRELS","PRF","PROAV","PTKA","PTKANT","PTKNEG","PTKVZ","PTKZU","PWAT","PWAV","PWS","TRUNC","VAFIN","VAIMP","VAINF","VAPP","VMFIN","VMINF","VMPP","VVFIN","VVIMP","VVINF","VVIZU","VVPP","XY"],"dashArray":""},
{"color":"green","arrowHead":"triangle,5","labels":["CC"],"hotkey":"d","type":"CC","targets":["$(","$,","$.","ADJA","ADJD","ADV","APPO","APPR","APPRART","APZR","ART","CARD","FM","ITJ","KOKOM","KON", .........................

Thanks
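
To gauge how much redundancy such a response carries, the arc definitions can be summarized offline. The following is a minimal sketch in Python, assuming the `arcs` fragment has been parsed into a list of dicts with `type` and `targets` keys, as in the report above. The helper name `summarize_arcs` and the sample data are hypothetical and are not part of brat.

```python
from collections import defaultdict


def summarize_arcs(arcs):
    """Count (arc type, target) pairs and group arc types that share a target list."""
    groups = defaultdict(list)
    pair_count = 0
    for arc in arcs:
        targets = tuple(arc.get("targets", []))
        pair_count += len(targets)
        groups[targets].append(arc["type"])
    return pair_count, dict(groups)


# Tiny stand-in for the real response (hypothetical sample data).
sample_arcs = [
    {"type": "AC", "targets": ["ADJA", "NN", "XY"]},
    {"type": "ADV", "targets": ["ADJA", "NN", "XY"]},
    {"type": "AG", "targets": ["NN"]},
]

pairs, by_targets = summarize_arcs(sample_arcs)
print(pairs)            # total number of (arc type, target) combinations
print(len(by_targets))  # number of distinct target lists
```

If every arc type lists the same targets, as appears to be the case in the report, the second number would be 1 while the first grows with the product of arc types and POS tags.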

ghost commented 11 years ago

Assigning to @spyysalo. This is most likely related to the inefficient way configurations are currently transferred and may require both client-side and server-side fixes.

spyysalo commented 11 years ago

@ninjin: not transfer-related; arc drawing is client-only. Assigning @amadanmath.

@seyyaw : could you please provide your annotation.conf for testing?

amadanmath commented 11 years ago

@seyyaw: I'll second the request for annotation.conf - I don't have any configuration large enough to observe the slowness. Also, an example of an annotated document (if the text itself is confidential, .ann file by itself is fine too, I'll generate some lorem ipsum). I'd like to approximate the complexity of your working conditions as much as possible.

reckart commented 11 years ago

@seyyaw: Can we synthesize an annotation.conf?

@amadanmath: We don't work with the BRAT config files, we generate the JSON directly from a Java-based back-end. Would a JSON block as documented here suffice as well?

http://brat.nlplab.org/embed.html

spyysalo commented 11 years ago

@ninjin is best equipped to answer questions about embedding.

spyysalo commented 11 years ago

@amadanmath : if there's no response from the submitter, you could try increasing the POS number in the example dependency annotation config configurations/dependency-example/annotation.conf. (Just add any number of "fake" POS types, say "POS1", "POS2", ... in the [entities] section until it gets slow.)

spyysalo commented 11 years ago

Also, you might want to add "fake" dependency types in [relations] too, as the complexity likely arises primarily from the combination entities * relations.

Note that this is related in part to #934, which does contain an example of a large configuration.
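
The suggestion above (padding the configuration with fake POS and dependency types until drawing gets slow) could be scripted rather than done by hand. This is a rough sketch under stated assumptions: it assumes the usual `annotation.conf` layout with `[entities]`, `[relations]`, `[events]` and `[attributes]` sections and relation lines of the form `Name Arg1:A|B, Arg2:A|B`. The function name and the `POS`/`DEP` naming scheme are made up for illustration.

```python
def make_synthetic_conf(n_entities, n_relations):
    """Build annotation.conf text with fake entity and relation types."""
    entity_names = [f"POS{i}" for i in range(1, n_entities + 1)]
    # Every relation accepts every entity type on both sides,
    # which mimics the "all targets are valid" situation in the report.
    any_entity = "|".join(entity_names)

    lines = ["[entities]"]
    lines.extend(entity_names)
    lines.append("")
    lines.append("[relations]")
    for j in range(1, n_relations + 1):
        lines.append(f"DEP{j}\tArg1:{any_entity}, Arg2:{any_entity}")
    # Empty sections, kept so the file has all four section headers.
    lines += ["", "[events]", "", "[attributes]", ""]
    return "\n".join(lines)


conf_text = make_synthetic_conf(n_entities=55, n_relations=40)
print(conf_text.splitlines()[0])  # prints "[entities]"
```

Writing `conf_text` to a copy of the example collection's `annotation.conf` and increasing both counts should show whether the slowdown tracks entities * relations, as suspected.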

amadanmath commented 11 years ago

> @amadanmath: We don't work with the BRAT config files, we generate the JSON directly from a Java-based back-end. Would a JSON block as documented here suffice as well?

@seyyaw, @reckart: Sorry, my brain conveniently chose to ignore that part. Sure, just dump the complete JSON somewhere I can download (preferably the pair of collection + document, if possible). :)

seyyaw commented 11 years ago

Hi all, for some reason I was not receiving all the mails from this issue. We are not using an annotation.conf file; we generate the JSON directly from a Java back-end plus database. Sorry for the delay. Let me know how I can help.

amadanmath commented 11 years ago

Okay, I think that should do something. The core issue is having such a huge number of annotations in one document; my computer takes about 1:15s to decide to refresh, then another 25s to render this document. However, once it is rendered, you should see a significant increase in the speed of selection arc drawing. It is still much slower than I would like, but it is no longer completely unusable.

That said, @spyysalo we might consider having a switch somewhere to disable clientside detection of valid targets, if this is not yet satisfactory.

seyyaw commented 11 years ago

Thanks @amadanmath. It would be nice to see how it works if client-side target detection is disabled.

spyysalo commented 11 years ago

Re: disabling client target detection: I suggest using a per-user option, as different annotators working on a single collection might have different preferences for this. We could initially just add a variable to client/src/configuration.js for controlling this, and then hook it up to UI and session data if it proves useful.
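
The effect of such a switch can be illustrated with a small, language-neutral sketch. It is written in Python for brevity and is not brat's client code, which is JavaScript. The names `valid_target_types`, `candidate_spans` and the `detect_targets` flag are hypothetical, and the data layout (arcs as dicts with `type` and `targets`, spans as id/type pairs) is an assumption based on the JSON earlier in this thread.

```python
def valid_target_types(arcs):
    """Union of target types over all arc types available from one source type."""
    valid = set()
    for arc in arcs:
        valid.update(arc["targets"])
    return valid


def candidate_spans(spans, arcs, detect_targets=True):
    """Return ids of spans offered as arc targets.

    spans: list of (span_id, span_type) pairs.
    detect_targets: hypothetical per-user switch; when False, every span
    is offered and no type checking is done on the client.
    """
    if not detect_targets:
        return [span_id for span_id, _ in spans]
    valid = valid_target_types(arcs)  # computed once, not once per span
    return [span_id for span_id, span_type in spans if span_type in valid]


# Hypothetical sample data.
spans = [("T1", "NN"), ("T2", "ART"), ("T3", "XY")]
arcs = [
    {"type": "AC", "targets": ["NN"]},
    {"type": "AG", "targets": ["NN", "XY"]},
]

print(candidate_spans(spans, arcs))                        # ['T1', 'T3']
print(candidate_spans(spans, arcs, detect_targets=False))  # ['T1', 'T2', 'T3']
```

With detection disabled, an invalid target would presumably only be rejected later, for example by the server, so the trade-off is faster drawing against later feedback.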

amadanmath commented 11 years ago

Tested more. Disabling target detection is not the biggest slowdown now. Further tests needed (complicated by the fact that one reload takes ages).