DeepRNN / image_captioning

Tensorflow implementation of "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"
MIT License

How long does training take? #70

Open HIHIHAHEI opened 4 years ago

HIHIHAHEI commented 4 years ago

How long does it take to train the model?

sunilnitk commented 4 years ago

It depends on your platform. If you work on Google Colab, it may take about two and a half hours.
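
If you do go the Colab route, it's worth confirming the runtime actually has a GPU before training; a minimal TF 1.x check, nothing repo-specific:

import tensorflow as tf

# An empty string means the runtime is CPU-only; enable a GPU under
# Runtime > Change runtime type > Hardware accelerator.
print(tf.test.gpu_device_name() or 'No GPU detected')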

HIHIHAHEI commented 4 years ago

It runs very slowly on my own computer, as shown below. Is there any way to speed it up?

batch: 19%|█▊ | 2094/11290 [55:29<4:13:59, 1.66s/it]
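
(For scale: the bar shows 2094 of 11290 iterations done at 1.66 s/it, so the remaining 9196 iterations come to about 9196 × 1.66 s ≈ 15,265 s ≈ 4 h 14 m, which matches the bar's ETA; a full 11290-iteration pass is roughly 5.2 hours at that rate.)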

sunilnitk commented 4 years ago

It requires more than 4 GB of GPU memory to run.
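
If 4 GB is the constraint, one hypothetical workaround is shrinking the batch, assuming the repo's Config class exposes batch_size as a plain attribute (config.py appears to default it to 32):

from config import Config

config = Config()
# Assumption: batch_size is read from this attribute when batches are built.
# Halving it roughly halves activation memory on a 4 GB card, at the cost
# of twice as many iterations per epoch.
config.batch_size = 16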

HIHIHAHEI commented 4 years ago

Mine is a 1050 Ti with 4 GB. I'm a student who wants to reproduce this, but cloud servers are too expensive.

sunilnitk commented 4 years ago

I know. But try running it on Google Colab, or on a college machine that has a GPU.

HIHIHAHEI commented 4 years ago

Fine, thank you.

notebookexplore commented 4 years ago

I'm running the pre-trained model in Colab using CPU (it doesn't seem faster on GPU) and for a single image it takes about 20 seconds to generate the caption and write the result to CSV. I've cached the loaded pre-trained model outside of the session block and disabled generating the image+caption, so it just needs to ingest a single image (see code below).

My implementation is based on Python 3, which I don't think should make a difference. See here: https://github.com/notebookexplore/show-attend-and-tell

Any ideas on how to speed this up or is there a more efficient way to run the pre-trained model?

import numpy as np
import tensorflow as tf
from tqdm import tqdm
from config import Config              # repo modules
from model import CaptionGenerator
from dataset import prepare_test_data

config = Config()
model = CaptionGenerator(config)
# pre-trained weights: a dict mapping variable name -> value
data_dict = np.load('./pre-trained-model/289999.npy', allow_pickle=True, encoding='latin1').item()

with tf.Session() as sess:
    # testing phase: restore each variable, then caption the test images
    data, vocabulary = prepare_test_data(config)
    for v in tqdm(tf.global_variables()):
        if v.name in data_dict.keys():
            sess.run(v.assign(data_dict[v.name]))
    model.test(sess, data, vocabulary)
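
One guess at where time could be saved, reusing the names from the snippet above: every sess.run() call carries fixed overhead, so issuing all the weight assigns in a single call avoids paying it once per variable. Weight loading is a one-time cost, though; the ~20 seconds per image mostly comes from the test loop itself.

# Sketch only: batch the assigns into one sess.run() instead of one call
# per variable; sess and data_dict are the objects defined above.
assign_ops = [v.assign(data_dict[v.name])
              for v in tf.global_variables()
              if v.name in data_dict]
sess.run(assign_ops)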
sunilnitk commented 4 years ago

I think that when you run it this way, the testing phase is set up again for every image.
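
If that is the issue, a hypothetical restructuring is to caption a whole folder in one call, so the graph, session, and weights are set up a single time; this assumes prepare_test_data() picks up every image under a config.test_image_dir field, which is how the repo's config.py appears to be laid out:

# Sketch, reusing config/model/sess from the snippet above. Assumption:
# prepare_test_data() enumerates all images in config.test_image_dir, so
# one model.test() call captions the entire folder.
config.test_image_dir = './test/images/'
data, vocabulary = prepare_test_data(config)
model.test(sess, data, vocabulary)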

leibohan commented 4 years ago

I am using a laboratory server with four 2080 Tis, and one pass over the batches finishes in 140 minutes. I have to hand in a report on reproducing this soon, so can anybody suggest a substitute setup that would still give an acceptable result? Thanks for any advice.