iesl / dilated-cnn-ner

Dilated CNNs for NER in TensorFlow

Support for Tensorflow 1.13 #31

Open Alaska47 opened 5 years ago

Alaska47 commented 5 years ago

How difficult would it be to update this code to work with higher versions of TensorFlow (like 1.13)? Does the entire model have to be rewritten (i.e., modifying code in cnn.py/cnn_char.py), or just the training/preprocessing procedure (i.e., train.py and preprocess.py)?

The reason I'm asking is that I'm trying to speed up the inference times of the models. Currently, the preprocessing is taking the bulk of the time, and in particular, I'm seeing some really weird behavior when I run my preprocessing inside a TensorFlow session versus outside of one.

For example, I have a method, batch_sentence_preprocess, which takes in a set of sentences and splits them into preprocessed batches of a given batch size. During batching, it calls another method, single_sentence_preprocess, which preprocesses a single sentence.

def single_sentence_preprocess(sentence, token_map, shape_map, char_map, token_int_str_map, shape_int_str_map, char_int_str_map, multiprocess=False):

def batch_sentence_preprocess(total_sentences, token_map, shape_map, char_map, token_int_str_map, shape_int_str_map, char_int_str_map, batch_size=128):
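
Roughly, the batching logic looks like this (a simplified sketch of my code, not the exact implementation):

def batch_sentence_preprocess(total_sentences, token_map, shape_map, char_map,
                              token_int_str_map, shape_int_str_map, char_int_str_map,
                              batch_size=128):
    # split the sentences into groups of batch_size and preprocess each sentence individually
    batches = []
    for start in range(0, len(total_sentences), batch_size):
        batch = [single_sentence_preprocess(sent, token_map, shape_map, char_map,
                                            token_int_str_map, shape_int_str_map,
                                            char_int_str_map)
                 for sent in total_sentences[start:start + batch_size]]
        batches.append(batch)
    return batches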

Interestingly, when I run batch_sentence_preprocess outside of the TensorFlow session, the average time it takes to preprocess a single sentence hovers around 0.0004 seconds. However, when I run batch_sentence_preprocess inside the TensorFlow session, the average time per sentence starts at around 0.0014 seconds, stays there until about halfway through batching all the sentences, and then decreases to 0.0004 seconds. I attached a log so you can see this behavior.

Doing batching outside of Tensorflow session

Average sent preprocess time per batch 0.000493312254548
Average sent preprocess time per batch 0.0004897788167
Average sent preprocess time per batch 0.000492854043841
Average sent preprocess time per batch 0.000502722337842
...
Average sent preprocess time per batch 0.000503158196807
Average sent preprocess time per batch 0.000500967726111
Average sent preprocess time per batch 0.000469474121928
Average sent preprocess time per batch 0.000504978001118
Average sent preprocess time per batch 0.000480454415083

Doing batching inside of Tensorflow session (inside of with sv.managed_session(FLAGS.master, config=config) as sess:)

Average sent preprocess time per batch 0.0021103117615
Average sent preprocess time per batch 0.00134823098779
Average sent preprocess time per batch 0.00142573565245
Average sent preprocess time per batch 0.00144574232399
Average sent preprocess time per batch 0.00143481045961
Average sent preprocess time per batch 0.00143280252814
...
Average sent preprocess time per batch 0.000499838963151
Average sent preprocess time per batch 0.000491224229336
Average sent preprocess time per batch 0.000479275360703
Average sent preprocess time per batch 0.000500436872244
Average sent preprocess time per batch 0.000483065843582
Average sent preprocess time per batch 0.0004892392592

I suspect it is because of how memory is allocated within a TensorFlow session versus outside of one. Do you have any idea why this could be happening? I'm confused because the batching code doesn't use any of the TensorFlow libraries; it's mostly just loops, list/dict operations, and numpy arrays.

I think this could also be because of a bug with older TensorFlow session managers. I was hoping that I could easily upgrade the code to TensorFlow 1.13 to see if this problem gets resolved, and so that I can use some of the multiprocessing features available in the newer tf.data module.
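
To be concrete, something along these lines is what I had in mind for tf.data (just a sketch, not code from this repo; preprocess_fn is a stand-in for single_sentence_preprocess, and since the preprocessing is plain Python it would still run under the GIL even with num_parallel_calls):

import numpy as np
import tensorflow as tf

def preprocess_fn(sentence):
    # stand-in for single_sentence_preprocess: turn one sentence into token ids
    tokens = sentence.decode("utf-8").split()
    return np.array([hash(t) % 1000 for t in tokens], dtype=np.int64)

raw_sentences = ["the cat sat", "dilated convolutions are fast"]
dataset = (tf.data.Dataset.from_tensor_slices(raw_sentences)
           .map(lambda s: tf.py_func(preprocess_fn, [s], tf.int64),
                num_parallel_calls=4)  # overlap preprocessing calls across threads
           .prefetch(2))               # overlap preprocessing with inference
next_example = dataset.make_one_shot_iterator().get_next()

with tf.Session() as sess:
    print(sess.run(next_example))  # ids for the first sentence
    print(sess.run(next_example))  # ids for the second sentence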

strubell commented 5 years ago

Off the top of my head, I think it wouldn't take that much effort to simply make the code run with TensorFlow 1.13 (have you tried it? it may just work). However, I think it would take some substantial refactoring to switch over to tf.Datasets etc. I would guess about a day of work, if you're already very comfortable with it.

Alaska47 commented 5 years ago

I'm trying it right now. It looks like tf.flags is no longer part of the tensorflow package, so I started by switching that out. From what I can see, a couple of the TensorFlow classes are deprecated, so I could either continue using them or switch them out.
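
The change I'm talking about is basically just swapping the flag namespace, e.g. (a sketch; the real flag definitions and defaults are in the existing scripts):

import tensorflow as tf

# tf.compat.v1.flags (an alias for absl.flags) replaces the old tf.flags namespace
flags = tf.compat.v1.flags
FLAGS = flags.FLAGS

flags.DEFINE_string("master", "", "address of the TensorFlow master")
flags.DEFINE_string("sample_text_file_name", "", "file of sentences to preprocess")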

I was wondering if you knew why I am getting this weird behavior with preprocessing inside a session vs outside. I don't have too much experience with what tensorflow does behind the scenes, so if you have any additional insight, that would be great!

strubell commented 5 years ago

Without looking in close detail, is it possible that processing outside the session is happening only once, whereas it's being repeated for each batch when inside the session?
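
One quick way to check would be to wrap the preprocessing entry point in a counter and compare the two configurations, e.g. (a sketch, assuming your batch_sentence_preprocess from above):

call_count = {"n": 0}

def counted_batch_preprocess(*args, **kwargs):
    # thin wrapper to count how many times preprocessing actually runs
    call_count["n"] += 1
    return batch_sentence_preprocess(*args, **kwargs)

# run inference using counted_batch_preprocess in place of batch_sentence_preprocess, then:
print("batch_sentence_preprocess ran %d times" % call_count["n"])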

Alaska47 commented 5 years ago

I don't think so. I'm running the preprocessing as soon as I create the session. The difference is between doing this (which takes almost twice as long):

with sv.managed_session(FLAGS.master, config=config) as sess:
    print("starting session")

    start = time.time()
    f = open(FLAGS.sample_text_file_name).read().strip().split("\n")
    print("{} {} {} {} {} {}".format(len(vocab_str_id_map), len(shape_str_id_map), len(char_str_id_map), len(vocab_id_str_map), len(shape_id_str_map), len(char_id_str_map)))
    sample_batches = batch_sentence_preprocess(f, vocab_str_id_map, shape_str_id_map, char_str_id_map, vocab_id_str_map, shape_id_str_map, char_id_str_map, batch_size=128)
    # print(np.asarray(sample_batches).shape)
    print("%.2f" % (time.time() - start))

    # do the actual inference here

versus doing this (which takes half as long):

start = time.time()
f = open(FLAGS.sample_text_file_name).read().strip().split("\n")
print("{} {} {} {} {} {}".format(len(vocab_str_id_map), len(shape_str_id_map), len(char_str_id_map), len(vocab_id_str_map), len(shape_id_str_map), len(char_id_str_map)))
sample_batches = batch_sentence_preprocess(f, vocab_str_id_map, shape_str_id_map, char_str_id_map, vocab_id_str_map, shape_id_str_map, char_id_str_map, batch_size=128)
print("%.2f" % (time.time() - start))

with sv.managed_session(FLAGS.master, config=config) as sess:
    print("starting session")
    ...
    # do the actual inference here