stanfordnlp / CoreNLP

CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing, coreference, sentiment analysis, etc.
http://stanfordnlp.github.io/CoreNLP/
GNU General Public License v3.0

Broken Pipe and Concurrent Timeout Exceptions from CoreNLP Server #672

Open iq-dot opened 6 years ago

iq-dot commented 6 years ago

Hi guys,

I am receiving several errors that keep appearing every few minutes, and I am not sure why. Can someone please tell me what might be causing them?

There are mainly two errors: one is an IOException (broken pipe), which is the most common, and the other is a timeout error from concurrent execution. Screenshots attached.

[Screenshots: "screen shot 2018-04-10 at 16 32 16", "screen shot 2018-04-10 at 16 37 03"]

nzv8fan commented 6 years ago

I've got the same issue: I send one large/long sentence to score, and the next time I call the API the pipe is broken. E.g.:

Thus it was not rare to find, on the Sunday, the tallboy on its feet by the fire, and the dressing table on its head by the bed, and the night-stool on its face by the door, and the washand-stand on its back by the window; and, on the Monday, the tallboy on its back by the bed, and the dressing table on its face by the door, and the night-stool on its back by the window and the washand-stand on its feet by the fire; and on the Tuesday…

Call the CoreNLP server a few times rapidly with that sentence and it will go down.

[pool-1-thread-3] INFO CoreNLP - [/127.0.0.1:45274] API call w/annotators tokenize,ssplit,pos,parse,sentiment
java.io.IOException: Broken pipe
    at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
    at sun.nio.ch.IOUtil.write(IOUtil.java:65)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
    at sun.net.httpserver.Request$WriteStream.write(Request.java:391)
    at sun.net.httpserver.FixedLengthOutputStream.write(FixedLengthOutputStream.java:78)
    at java.io.FilterOutputStream.write(FilterOutputStream.java:97)
    at sun.net.httpserver.PlaceholderOutputStream.write(ExchangeImpl.java:439)
    at edu.stanford.nlp.pipeline.StanfordCoreNLPServer$CoreNLPHandler.handle(StanfordCoreNLPServer.java:883)
    at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
    at sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:83)
    at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:82)
    at sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:675)
    at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
    at sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:647)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

J38 commented 6 years ago

I'm having trouble reproducing this error. Could you provide more details? For instance, how many times are you issuing this request, and by what means? Also, what settings are you using when you launch the server?

I can generate "Broken Pipe" errors, but they do not bring down the server.

iq-dot commented 6 years ago

So I am running the CoreNLP server, this exact distribution: stanford-corenlp-full-2018-02-27

I run it with 4 threads; here is my Java command:

java -cp "*" -mx4g edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port $PORT -timeout 15000 -annotators "tokenize" -preload "tokenize,ssplit,pos,lemma,ner" -threads 4

Also, the text is passed to the server as an HTTP request.

J38 commented 6 years ago

Could you show me a sample request?

J38 commented 6 years ago

Also, have you tried running the server with more RAM? Depending on what annotators you are using, 4 GB might not be enough.

iq-dot commented 6 years ago

Hi,

So we make an HTTP call like this:

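// `request` is assumed to be the superagent module; the original snippet omits the import.
const request = require('superagent');

const nlpAPI = 'http://stanford-host.com';
const nlpOptions = {
    annotators: 'tokenize,ner',
    outputFormat: 'json'
};

// Wrapped in a function (the name `annotate` is illustrative) so that `return` is valid.
// `onGotResp` is the caller's response handler, defined elsewhere.
function annotate(text) {
    return request
        .post(nlpAPI)
        .timeout({ response: 15000 })
        .query({ properties: JSON.stringify(nlpOptions) })
        .send(text)
        .then(onGotResp);
}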

Note that even though I pass in only those two annotators, the NLP server still runs all of these annotators: tokenize,ssplit,pos,lemma,ner.

I don't really need the lemma annotator, for example. So if I could force the server not to use those annotators, it would help with memory.

So how much RAM would you recommend? Is the problem also related to my trying to run 4 threads? My box is a 4-core, 4 GB system. Perhaps I should just run single-threaded servers as multiple instances using Docker? Would that help?
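
One possible explanation for the extra annotators, offered here as an assumption rather than something confirmed in this thread, is that ner depends on the output of ssplit, pos and lemma, so the server adds them to the pipeline. The sketch below illustrates that kind of dependency expansion. The REQUIRES table is hand-written for illustration and is not read from CoreNLP, and the function name expand is hypothetical.

```javascript
// Hand-written dependency table for illustration only; NOT taken from CoreNLP.
const REQUIRES = {
  tokenize: [],
  ssplit: ['tokenize'],
  pos: ['tokenize', 'ssplit'],
  lemma: ['tokenize', 'ssplit', 'pos'],
  ner: ['tokenize', 'ssplit', 'pos', 'lemma'],
};

// Hypothetical helper: return the requested annotators plus everything they
// depend on, with each dependency listed before the annotator that needs it.
function expand(annotators) {
  const out = [];
  const visit = (name) => {
    if (out.includes(name)) return;
    for (const dep of REQUIRES[name] || []) visit(dep);
    out.push(name);
  };
  annotators.forEach(visit);
  return out;
}

console.log(expand(['tokenize', 'ner']).join(','));
// prints "tokenize,ssplit,pos,lemma,ner"
```

Under that assumption, asking for tokenize,ner would yield the same five annotators that appear in the server log, so dropping lemma while keeping ner may not be possible.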

nzv8fan commented 6 years ago

On my side, there are two ways to reproduce and two workarounds. FYI, I'm using the sentiment annotator, but I think the issue is in the parsing.

I can't share a full example as the data I'm using is sensitive.

Problem one: Input strings that contain no valid words hang the parser. E.g. a very long string of "++++++++++++++++++++++++++++++++++++++++++++++++++------------------------------------------------------------" will just hang the parser (shorter variants of this nonsense will return). Solution: check the input data to ensure there is at least one alphabetic character in the string.

Problem two: Extremely long (but valid) sentences will cause the same error. Solution: swap the parser for the alternative CoreNLP Lex Parser.
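
A minimal sketch of the check from problem one, written in JavaScript to match the client code earlier in the thread. The helper name hasLetter is hypothetical, and matching any Unicode letter with \p{L} rather than only A-Z is an assumption made so that non-English input is not rejected.

```javascript
// Hypothetical helper: true when the input contains at least one letter.
// \p{L} matches any Unicode letter and needs the `u` flag.
function hasLetter(text) {
  return typeof text === 'string' && /\p{L}/u.test(text);
}

console.log(hasLetter('++++++++++----------')); // false
console.log(hasLetter('Thus it was not rare to find')); // true
console.log(hasLetter('')); // false
```

A caller could skip the request to the server whenever hasLetter(text) returns false. This only covers problem one; very long but valid sentences still pass the check.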

asfer commented 6 years ago

I am running some load tests against CoreNLP with large texts, and the number of timeouts I get on the client side seems to be equal or very close to the number of java.io.IOException: Broken pipe entries in the server logs.

So a timeout on the client probably corresponds to a broken pipe on the server, as the name suggests.

Also, a TimeoutException is probably thrown when a server task takes longer than the threshold. Currently we do not see any, probably because we increased the -timeout parameter at server startup.
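
The relationship described above can be modelled roughly as follows. This is a simplified sketch, not CoreNLP code: the function name outcome and the three result labels are hypothetical. It assumes the server either finishes or gives up once its -timeout value is reached, and that the client closes the connection at its own timeout.

```javascript
// Simplified model (an assumption, not CoreNLP behaviour verified in this thread):
// decide which side gives up first for one request.
function outcome(processingMs, serverTimeoutMs, clientTimeoutMs) {
  // Moment at which the server has something to write back:
  // either the result or its own timeout error.
  const serverDoneMs = Math.min(processingMs, serverTimeoutMs);
  if (clientTimeoutMs <= serverDoneMs) {
    // The client has already closed the socket when the server writes.
    // A tie is treated as a client timeout here; in practice it is a race.
    return 'client timeout, broken pipe on server';
  }
  if (processingMs > serverTimeoutMs) {
    return 'server timeout';
  }
  return 'ok';
}

// Values from the thread: -timeout 15000 on the server, response: 15000 on the client.
console.log(outcome(20000, 15000, 15000)); // client timeout, broken pipe on server
console.log(outcome(20000, 15000, 20000)); // server timeout
console.log(outcome(1000, 15000, 15000));  // ok
```

In this model, equal client and server timeouts make a slow request appear as a broken pipe, while giving the client more headroom than the server turns the same request into a server-side timeout.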

iq-dot commented 6 years ago

Here is an example of a text I am getting concurrent timeout errors with:

Nicki Minaj has confirmed rumours that she's dating Eminem, after answering a fan's question on Instagram on Thursday. The Chun-Li rap beauty, 35, took to the image-sharing app to promote a new single she's featured on — YG's Big Bank, alongside 2 Chainz and Big Sean — when one of her followers boldly enquired: 'You dating Eminem???'Trinidadian-born Nicki, who was most recently romantically linked to 44-year-old rap veteran Nas, simply wrote back: 'Yes.'Making sweet music together? Nicki Minaj, left, confirmed that she's dating fellow rapper Eminem, right, during an exchange with a fan on Instagram on Thursday Interestingly, she mentions Eminem during her verse on Big Bank, rapping: 'Uh oh/Back again/Back to back Maybach, stack the M's/Told 'em I met Slim Shady, bag the M/Once he go black, he'll be back again.'MailOnline has contacted representatives for Nicki Minaj and Eminem for comment. While it's unknown if Nicki was being serious in her response to the enquiring fan, she and Eminem — real name Marshall Mathers III — have long known each other, as they worked together on her hit 2010 single Roman's Revenge.After the track's release, Nicki revealed that she sent the rap superstar another track before they landed on Roman's Revenge, telling MTV News: 'He didn't say, "I don't love it"; he just said, "Can you send me something that's a little more me??"' Screenshot: On Thursday night, a fan took to Twitter to share a screenshot of the exchange Letting loose: Nicki took to Instagram on Friday to promote a new single she's featured on — YG's Big Bank, alongside 2 Chainz and Big Sean - when she made the confession Loving life: Nicki looked in good spirits in the video as she rapped to the camera whilst casually clad in a red padded jacket and printed baseball cap whilst sporing bright pink lipstick Once they settled on the Swizz Beatz-produced beat for Roman's Revenge, next came the lengthy writing process, which Nicki described as 'competitive'.'I remember, every 
time I wrote a verse to Roman's Revenge and sent it to Eminem, he would send a new verse back,' she told XXL. 'It was competitive, it was fun.'I think all the big male artists (air quotes) also treated me with a sense of respect as an emcee. They took me serious. If they were on a track with me, they knew they had to come hard.'At the start of January, it was confirmed that Nicki and Nas had parted ways in December after the strain of a long-distance relationship proved 'too much', with Nicki living on the west coast in Beverly Hills and Nas based in New York City. Looking: Late last year, Eminem — real name Marshall Mathers III — opened up about looking for love more than a decade after the end of his marriage to Kimberly Anne Scott (pictured)The couple are believed to have started dating in June 2017, but sources close to the situation told TMZ 'the relationship ran out of steam'.An insider added: 'They respect each other, and there won't be any trash talking — but, on the other hand, they won't be hanging out as friends either.'The site also pointed to recent reports claiming Nicki — real name Onika Tanya Maraj — was pregnant with the couple's child, but states there was 'no truth' in the rumour.Nas, born Nasir Bin Olu Dara Jones, is now said to be focusing on his record label and his chicken and waffles joint, Sweet Chick, in the wake of the split, while Nicki also juggles 'multiple business ventures outside of music'.At the time, neither party addressed the reports via their social media channels. The ex factor: In January, it was revealed that her romance with Nas had come to an end, after seven months of dating. Pictured in June 2017, when they are believed to have got togetherThe last interaction between the pair was when Nas wished Nicki a happy birthday via his Instagram page on December 8, writing: 'Get The MFkN Money! Happy Birthday To The QUEEN OF NY / HIP HOP @nickiminaj.' 
Nicki and Nas have been friends for many years, but never formally confirmed rumours they were dating.A source claimed in September: 'Nicki and Nas are just very dear friends nothing romantic. They've been friends forever and have seen each other's careers take off.'Nas is a best friend to her so as of now nothing is stirring up. People always joke around though that they make a great couple!' Moving on: The couple are believed to have started dating in June 2017, but sources close to the situation have claimed 'the relationship ran out of steam'In May of last year, the Super Bass hitmaker declared that she was 'celibate' and 'hated men'. But interestingly she said she was willing to make an exception for her long-time friend Nas.Speaking on The Ellen DeGeneres Show, Nicki said: 'I'm just chillin' right now. I'm celibate. I wanted to go a year without dating any man. I hate men. I might make an exception to the rule for [Nas], because he's so dope.'Nicki previously dated fellow rapper Meek Mill, 31, for two years from early 2015 to January 2017. Close: Nicki and Nas have been friends for many years, but never formally confirmed rumours they were dating throughout the course of their relationshipThe musician was in a long-term relationship with hip hop artist Safaree Samuels, 36, from 2000 to 2014. Nas has daughter Destiny, 23, with his former fiancée Carmen Bryan, and son Knight, eight, with his ex-wife, R&B singer Kelis, 38.The couple tied the knot in Atlanta, Georgia, in 2005 after two years together.Kelis filed for divorce four years later in April 2009, just three months before welcoming the couple's son, with the split finalized the following year in May 2010.Nas is also believed to have briefly dated actress and singer Mary J. Blige, 46. 
Long-time pals: The rapper, 35, and the hip-hop star, 44, were said to have parted ways back in December, after the strain of a long-distance relationship proved 'too much'For his part, in December Eminem opened up about looking for love more than a decade after the end of his marriage to high school sweetheart Kimberly Anne Scott. The rapper, who was married twice to Kimberly until their second divorce in 2006, revealed he's used the dating app Tinder, as well as jokingly referencing the use of gay app Grindr, and has even gone to strip clubs to meet women.'It's tough,' the 45-year-old told Vulture. 'Since my divorce I've had a few dates and nothing's panned out in a way that I wanted to make it public. Dating's just not where I'm at lately.'It's not certain if the rapper was joking about his methods for meeting people, but aside from Tinder, Eminem also said he would also try to find dates on Grindr and at strip clubs. 'And Grindr,' he told the interviewer with a laugh. 'Queen of NY': The last interaction between the pair on social media was when Nas wished Nicki a happy birthday via his Instagram page on December 8'Going to strip clubs is how I was meeting some chicks,' he added. 'It was an interesting time for me.'The 8 Mile star assured the interviewer he wasn't lonely, answering whether fame had left him feeling isolated: 'Am I lonely? No, I’m good. Thanks for asking though.'Eminem was first married to Kimberly from 1999 to 2001, before re-marrying her in 2006. Their second marriage was short-lived as well, with the couple divorcing again that same year.Together Eminem and Kimberly have daughter Hailie, who turned 22 on Christmas Day. Kim also has daughter Whitney from another relationship, who Eminem adopted. The couple also adopted Kim's late sister Dawn's daughter Alaina. 
Over it: In May of last year, the Super Bass hitmaker declared that she was 'celibate' and 'hated men', but interestingly said she was willing to make an exception for her long-time friend Nas

SoluMilken commented 2 years ago

I've got the same issue when I use the Chinese tokenizer.

AngledLuffa commented 2 years ago

Can you give a more concrete example? Is this with 4.4.0?
