tensorflow / models

Models and examples built with TensorFlow

Can't run textsum on toy dataset. Can anyone suggest something? #4356

Closed mainakchain closed 5 years ago

mainakchain commented 6 years ago

Exception in thread Thread-103:
Traceback (most recent call last):
  File "/usr/lib/python3.4/threading.py", line 920, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.4/threading.py", line 868, in run
    self._target(*self._args, **self._kwargs)
  File "/home/mainak.chain/Document_Summarization/workspace/textsum/batch_reader.py", line 139, in _FillInputQueue
    data.ToSentences(article, include_token=False)]
  File "/home/mainak.chain/Document_Summarization/workspace/textsum/data.py", line 230, in ToSentences
    return [s for s in s_gen]
  File "/home/mainak.chain/Document_Summarization/workspace/textsum/data.py", line 230, in <listcomp>
    return [s for s in s_gen]
  File "/home/mainak.chain/Document_Summarization/workspace/textsum/data.py", line 202, in SnippetGen
    start_p = text.index(start_tok, cur)
TypeError: 'str' does not support the buffer interface

Can anyone please help me resolve the problem?

tensorflowbutler commented 6 years ago

Thank you for your post. We noticed you have not filled out the following field in the issue template. Could you update them if they are relevant in your case, or leave them as N/A? Thanks.
What is the top-level directory of the model you are using
Have I written custom code
OS Platform and Distribution
TensorFlow installed from
TensorFlow version
Bazel version
CUDA/cuDNN version
GPU model and memory
Exact command to reproduce

mainakchain commented 6 years ago

Don't worry butler! I fixed the issue. It needed some slight code changes; I got the lead for them from this issue. Could one of the maintainers please pull that code in? It helped me a lot.
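
For readers hitting the same traceback: the error comes from calling `index` on a `bytes` object with a `str` argument, which Python 3 rejects (Python 3.4 reports it as "'str' does not support the buffer interface"; newer versions word the message differently). The sketch below reproduces the failure and shows one possible workaround, decoding to `str` before searching. The helper name `find_token` is hypothetical, and this is not necessarily the change made in the linked issue.

```python
def find_token(text, token, start=0):
    """Return the index of token in text, accepting bytes or str for either.

    Hypothetical helper: both arguments are normalized to str so that
    the search never mixes bytes with str.
    """
    if isinstance(text, bytes):
        text = text.decode("utf-8")
    if isinstance(token, bytes):
        token = token.decode("utf-8")
    return text.index(token, start)


raw = b"<s> hello world </s>"

# Mixing bytes and str reproduces the failure from the traceback.
try:
    raw.index("<s>", 0)
except TypeError as err:
    print("TypeError:", err)

# After normalizing both sides to str, the search succeeds.
print(find_token(raw, "</s>"))
```

Note that once the text is decoded, positions are character offsets, not byte offsets, so any slicing that follows should also be done on the decoded string.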

weizhenzhao commented 6 years ago

@tensorflowbutler Hi dear, I have two questions. (1) How should I prepare the toy data? I didn't find any training, testing, or validation data files in the data directory after I cloned the textsum project, so could you give me some sample files, or the steps to create the training and testing data files? (2) Could you show me how to deploy the model to TensorFlow Serving as a RESTful API? I noticed that the official TensorFlow Serving site gives an example for the MNIST model, but I don't know how to deploy the textsum model to TensorFlow Serving.

Thank you very much WeiZhen

mainakchain commented 6 years ago

Hey WeiZhen! The toy dataset and vocab file are present here. Your own training data and vocab file need to be in that same format. For deploying textsum, you need to first clone it and follow the 'how to run' section given here. Just ensure you have CUDA set up if you are using tensorflow-gpu. Otherwise, you are good to go, man!
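
As a rough illustration of putting a vocab file "in that format": the sketch below assumes each line of the vocab file holds a token and its count separated by a space, and that the special tokens `<s>`, `</s>`, `<UNK>` and `<PAD>` must be present. Both points are assumptions to verify against the toy vocab file in the repository, and `build_vocab_lines` is a hypothetical helper, not part of textsum.

```python
from collections import Counter

# Assumed special tokens; confirm against the toy vocab file.
SPECIAL_TOKENS = ["<s>", "</s>", "<UNK>", "<PAD>"]


def build_vocab_lines(texts):
    """Count whitespace-separated tokens and return 'token count' lines."""
    counts = Counter()
    for text in texts:
        counts.update(text.split())
    # Make sure every special token appears at least once.
    for tok in SPECIAL_TOKENS:
        if tok not in counts:
            counts[tok] = 1
    # Most frequent tokens first.
    return ["%s %d" % (word, n) for word, n in counts.most_common()]


lines = build_vocab_lines([
    "<s> the cat sat </s>",
    "<s> the dog ran </s>",
])
print("\n".join(lines))
```

The article and abstract data itself is stored in a binary format, so it is best produced with the conversion script that ships with textsum and not written by hand.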

Regards, Mainak

weizhenzhao commented 6 years ago

Hi @mainakchain

I know the data is there, but as the instructions say:

`$ ls -R
.:
data textsum WORKSPACE

./data:
vocab test-0 training-0 training-1 validation-0 ...(omitted)

./textsum:
batch_reader.py beam_search.py BUILD README.md seq2seq_attention_model.py data data.py seq2seq_attention_decode.py seq2seq_attention.py seq2seq_lib.py

./textsum/data:
data vocab`

Should I copy data/data and data/vocab to the data directory in my personal workspace, and rename the data/data file to data/training-data? I ask because the next command in the instructions is:

Run the training.

$ bazel-bin/textsum/seq2seq_attention \
    --mode=train \
    --article_key=article \
    --abstract_key=abstract \
    --data_path=data/training-* \
    --vocab_path=data/vocab \
    --log_root=textsum/log_root \
    --train_dir=textsum/log_root/train

Thanks WeiZhen

mainakchain commented 6 years ago

Hi WeiZhen! No, you don't need to rename it or copy it. After cloning, just pass the data file as --data_path=data/data if that's the file name. The training-* pattern in the instructions is only for the case where the training data has been split into shard files. Once you have run the bazel commands, just supply the path/file_name of your data.
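
To illustrate the difference between the two forms of --data_path, here is a small sketch that assumes the path is expanded as a shell-style glob pattern (an assumption about how the reader treats the flag). The directory and file names are created in a temporary folder purely for the demonstration.

```python
import glob
import os
import tempfile


def expand_data_path(pattern):
    """Return the sorted list of files matching a glob pattern."""
    return sorted(glob.glob(pattern))


with tempfile.TemporaryDirectory() as root:
    # Create empty stand-ins for a single data file and two shards.
    for name in ("data", "training-0", "training-1", "vocab"):
        open(os.path.join(root, name), "w").close()

    # A wildcard pattern matches every shard file.
    shards = [os.path.basename(p)
              for p in expand_data_path(os.path.join(root, "training-*"))]
    # A plain file name matches only that one file.
    single = [os.path.basename(p)
              for p in expand_data_path(os.path.join(root, "data"))]

print(shards)
print(single)
```

When a pattern contains `*`, quoting it on the command line keeps the shell from expanding it before the program sees it.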

Regards, Mainak

weizhenzhao commented 6 years ago

@mainakchain Do you mean this?

$ bazel-bin/textsum/seq2seq_attention --mode=train --article_key=article --abstract_key=abstract --data_path=data/data --vocab_path=data/vocab --log_root=textsum/log_root --train_dir=textsum/log_root/train

That is, change --data_path=data/training-* to --data_path=data/data?

Thanks WeiZhen

weizhenzhao commented 6 years ago

@mainakchain Hi Dear

I've run the training command and trained until the loss reached:
running_avg_loss: 0.001881
running_avg_loss: 0.003666
running_avg_loss: 0.003663
running_avg_loss: 0.001428

My question now is how to deploy the model to TensorFlow Serving. Thank you very much.

WeiZhen

weizhenzhao commented 6 years ago

Hi Dear

Can anyone help me?

Thanks WeiZhen

mainakchain commented 6 years ago

Hey WeiZhen,

Glad that you could train it. I do not understand your question about deploying the model in tensorflow serving. Can you please elaborate on what you wanna do?

Regards, Mainak

weizhenzhao commented 6 years ago

I want to deploy the model as a RESTful service, so that I can get the result in JSON format via an HTTP request.
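
For context, a minimal sketch of what the client side of such a service could look like, using the predict endpoint of TensorFlow Serving's REST API (`/v1/models/<name>:predict` with an `instances` list in the JSON body). It assumes the textsum model has already been exported in a form TensorFlow Serving can load, which the thread does not cover. The model name `textsum`, the port, and the `article` input key are hypothetical placeholders.

```python
import json
import urllib.request


def build_predict_request(host, model_name, instances):
    """Build a POST request for TensorFlow Serving's REST predict endpoint."""
    url = "http://%s/v1/models/%s:predict" % (host, model_name)
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(host, model_name, instances, timeout=10.0):
    """Send the request and return the 'predictions' field of the reply."""
    req = build_predict_request(host, model_name, instances)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["predictions"]


# Hypothetical model name and input key, shown without contacting a server.
req = build_predict_request(
    "localhost:8501", "textsum", [{"article": "some article text"}])
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Calling `predict(...)` only works once a server is actually running with a model under that name; the input keys must match the signature of the exported model.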

ymodak commented 5 years ago

Closing this issue since it's resolved. Thanks!