Closed stevesolun closed 4 years ago
Hi, I'm testing it now on Ubuntu 18.04, will let you know
@danielkorat I have done some debugging and found that the problem/issue is in the following function:
def train_file_callback(attr, old, new):
print("point17 - train_file_callback")
global train_data
SENTIMENT_OUT.mkdir(parents=True, exist_ok=True)
train = TrainSentiment(parse=True, rerank_model=None)
if len(train_src.data['file_contents']) == 1:
print("point17.1 - train_file_callback")
train_data = read_csv(train_src, index_cols=0)
print("point17.1.1 - train_file_callback")
file_name = train_src.data['file_name'][0]
print("point17.1.2 - train_file_callback")
raw_data_path = SENTIMENT_OUT / file_name
print("point17.1.3 - train_file_callback")
train_data.to_csv(raw_data_path, header=False)
print(f'Running_SentimentTraining on data...')
print(raw_data_path)
train.run(data=raw_data_path)
print("point17.2 - train_file_callback")
It can't proceed to point17.2 and it fails on the train.run(data=raw_data_path)
I am also getting the following error from Bokeh:
ERROR:bokeh.server.protocol_handler:error handling message Message 'PATCH-DOC' (revision 1) content: {'events': [{'kind': 'ModelChanged', 'model': {'type': 'ColumnDataSource', 'id': '1049'}, 'attr': 'data', 'new': {'file_contents': ['data:text/csv;base64,Ii1GYW50YXN0aWMhIC1CZWVuIG1lYW5pbmcgdG8gdmlzaXQgRWxsaW90J3MgZm9yIGFnZXMsIGl0cyBlaXRoZXIgY2xvc2VkIG9yIGZ1bGwgZXZlcnkgdGltZSBJIGhhdmUgdHJpZWQgdG8gZ28gaW4gdGhlIHBhc3QuIC1CZWluZyBKYW51YXJ5IGFuZCBsdW5jaHRpb......
If we will dig deeper the actual problem is here:
def run(self, data: Union[str, PathLike] = None, parsed_data: Union[str, PathLike] = None,
out_dir: Union[str, PathLike] = TRAIN_OUT):
print("point17 - train.run")
if not parsed_data:
print("point17.1 - train.run")
if not self.parser:
raise RuntimeError("Parser not initialized (try parse=True at init)")
print("point17.2 - train.run")
parsed_dir = Path(out_dir) / 'parsed' / Path(data).stem
print("point17.3 -train.run")
parsed_data = self.parse_data(data, parsed_dir)
print("point17.4 -train.run")
generated_aspect_lex = self.acquire_lexicon.acquire_lexicons(parsed_data)
print("point17.5 - train.run")
_write_aspect_lex(parsed_data, generated_aspect_lex, LEXICONS_OUT)
print("point17.6 - train.run")
generated_opinion_lex_reranked = self.rerank.predict(AcquireTerms.acquired_opinion_terms_path,
AcquireTerms.generic_opinion_lex_path)
print("point17.7 - train.run")
_write_opinion_lex(parsed_data, generated_opinion_lex_reranked, LEXICONS_OUT)
print("point17.8 - train.run")
return generated_opinion_lex_reranked, generated_aspect_lex
The following line
generated_opinion_lex_reranked = self.rerank.predict(AcquireTerms.acquired_opinion_terms_path,
AcquireTerms.generic_opinion_lex_path)
doesn't work, it doesn't proceed to print("point17.7 - train.run")
in ABSA solution train.py file (from nlp_architect.models.absa.train.train import TrainSentiment
)
Another potentially useful info is that the rerank_model.h5 file contains:
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>921D8DBE191CE539</RequestId><HostId>XtEQOlCo9t/moznF/tUc7wlhJKPpXxNvGhcATdr8369njD8mJOBM9RfUI+4SQaXJEY2CzqogT60=</HostId></Error>
Path to file: /home/my_user/nlp-architect/cache/absa/train/reranking_model
Please update your branch/working copy to the latest from master branch. We had to update our public URLs (models weight files). Thanks
From: Steve Solun notifications@github.com Sent: Sunday, September 20, 2020 2:33:51 PM To: NervanaSystems/nlp-architect nlp-architect@noreply.github.com Cc: Subscribed subscribed@noreply.github.com Subject: Re: [NervanaSystems/nlp-architect] Runtime errors when running with the ABSA model (#167)
Another potentially useful info is that the rerank_model.h5 file contains:
<?xml version="1.0" encoding="UTF-8"?>
AccessDenied
Path to file: /home/my_user/nlp-architect/cache/absa/train/reranking_model
Intel Israel (74) Limited
This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies.
@peteriz I am sorry, still doesn't work. Please provide a step by step tutorial how can I do this.
I have cloned the project from master, tried to run ui.py in solutions/absa_solution but getting:
user@user: my_path/nlp-architect$ python3 solutions/absa_solution/ui.py
Opening Bokeh application on http://localhost:5006/
INFO:bokeh.server.server:Starting Bokeh server version 1.2.0 (running on Tornado 6.0.4)
INFO:bokeh.server.tornado:Torndado websocket_max_message_size set to 5242880000 bytes (5000.00 MB)
2020-09-20 16:23:50,942 ERROR Uncaught exception GET / (::1)
HTTPServerRequest(protocol='http', host='localhost:5006', method='GET', uri='/', version='HTTP/1.1', remote_ip='::1')
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/tornado/web.py", line 1703, in _execute
result = await result
File "/usr/local/lib/python3.7/dist-packages/tornado/gen.py", line 742, in run
yielded = self.gen.throw(*exc_info) # type: ignore
File "/usr/local/lib/python3.7/dist-packages/bokeh/server/views/doc_handler.py", line 55, in get
session = yield self.get_session()
File "/usr/local/lib/python3.7/dist-packages/tornado/gen.py", line 735, in run
value = future.result()
File "/usr/local/lib/python3.7/dist-packages/tornado/gen.py", line 742, in run
yielded = self.gen.throw(*exc_info) # type: ignore
File "/usr/local/lib/python3.7/dist-packages/bokeh/server/views/session_handler.py", line 77, in get_session
session = yield self.application_context.create_session_if_needed(session_id, self.request)
File "/usr/local/lib/python3.7/dist-packages/tornado/gen.py", line 735, in run
value = future.result()
File "/usr/local/lib/python3.7/dist-packages/tornado/gen.py", line 748, in run
yielded = self.gen.send(value)
File "/usr/local/lib/python3.7/dist-packages/bokeh/server/contexts.py", line 215, in create_session_if_needed
self._application.initialize_document(doc)
File "/usr/local/lib/python3.7/dist-packages/bokeh/application/application.py", line 178, in initialize_document
h.modify_document(doc)
File "/usr/local/lib/python3.7/dist-packages/bokeh/application/handlers/function.py", line 133, in modify_document
self._func(doc)
File "solutions/absa_solution/ui.py", line 55, in _doc_modifier
grid = _create_ui_components()
File "solutions/absa_solution/ui.py", line 216, in _create_ui_components
with open(join(SOLUTION_DIR, "dropdown.js")) as f:
FileNotFoundError: [Errno 2] No such file or directory: '/my_path/nlp-architect/nlp_architect/solutions/absa_solution/dropdown.js'
Finally fixed this, needed to copy solutions dir to nlp-architect/nlp-architect/
Please make it easier to future users.
I am testing the solution now, please don't close the issue. Thanks.
@stevesolun Thanks for noticing the issue
I'm working on merging the absa
branch to the master
branch to solve it
@stevesolun PR-168 fixes the issues you mentioned and will be merged soon
@danielkorat thanks :)
I will be happy to check it out when merged.
And please update the docker to apply all the changes. I am sure people will be happy to work with a docker solution.
I am trying to run the ABSA solution on my local Ubuntu 18.04 even with the provided train reviews dataset but getting the following error (using the exact instructions + virtualenv):
'file_name': ['tripadvisor_co_uk-travel_restaurant_reviews_sample_2000_train.csv']}}], 'references': []}: OSError('Unable to load model. Filepath is not an hdf5 file (or h5py is not available) or SavedModel.',)
When I am trying to load a bunch of reviews, I get the following error:'references': []}: ParserError('Expected 2 fields in line 2, saw 3')
Please advise how to fix it.