macournoyer / neuralconvo

Neural conversational model in Torch

Is there any pre-trained model? #59

Open GZJackCai opened 7 years ago

GZJackCai commented 7 years ago

My computer is not very powerful. Could someone give a link where we can download a pre-trained model?

maeda commented 7 years ago

Hi, you can use an Amazon EC2 GPU machine to train your models. The model files can be quite large, 200MB or more. For example, I trained a model using parameters similar to those @macournoyer showed, and the resulting file was close to 1GB.

Anyway, you can start experimenting on Amazon EC2 instances for now, since your own machine isn't sufficient.

TTN- commented 7 years ago

Hi friends, I'm in the same boat here. I have a decent laptop with a high-end CPU, but Intel integrated graphics. With the parameters th train.lua --dataset 50000 --hiddenSize 1000 it would take 31 days to train, which is far too long. I could train on the Amazon service, which starts at 29 USD/month, or build a computer (hundreds of $$).

Any uploads of trained data would be wonderful! Be it torrent, dropbox, googledrive, all works for me.

kenkit commented 7 years ago

Well, I have just opened a similar issue. EDIT: And if anyone uploads one, please share it as a CPU-loadable version too. You can convert it with model = model:float()

TTN- commented 7 years ago

Getting OpenCL working is an absolute pain on a mobile hybrid-graphics system. I've given up on it and will later build a desktop with an Nvidia card that has proper support, unlike my current hardware.

Atm I'm just training on my CPU with the parameters th train.lua --dataset 30000 --hiddenSize 1000 --maxEpoch 10. It's going to take close to a week to complete; it's almost halfway. I'll share the data to Dropbox or Google Drive when it's done. No idea if it's going to be any good, but I'm hopeful.

Current terminal output: http://pastebin.com/RfhqKHNd Looks promising!

kenkit commented 7 years ago

Nice, but won't that consume a lot of power? I think you should have tried one of the AWS instances; it could have taken much less time, I think. EDIT: On the full training set, that is.

TTN- commented 7 years ago

That's what I thought too, until I looked at the costs. I signed up with AWS, but it's going to cost me 30 USD for a month's dev subscription, and then $0.65/hr for just one GPU's worth of power. Running that for 3 days comes to $76.80 all up; a week would cost 139 USD. I figured it'd be better to drop 250 USD or so on a card instead, especially if I plan to do this more often. Power-wise, my laptop's CPU is 45W max TDP; one week of running it under full load would only cost 1.23 USD. :-)
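The figures above can be checked with a few lines of arithmetic. This sketch uses the rates quoted in the comment; the electricity price is back-solved from the $1.23 figure, so treat it as an assumption.

```python
subscription = 30.00   # AWS dev subscription, USD/month (as quoted)
gpu_rate = 0.65        # USD per GPU-hour (as quoted)

three_days = subscription + gpu_rate * 3 * 24   # 76.8 USD
one_week = subscription + gpu_rate * 7 * 24     # 139.2 USD

# Laptop: 45 W TDP flat out for a week, at an assumed ~0.163 USD/kWh
laptop_kwh = 45 / 1000 * 7 * 24                 # 7.56 kWh
laptop_cost = laptop_kwh * 0.163                # ~1.23 USD

print(round(three_days, 2), round(one_week, 2), round(laptop_cost, 2))
```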

kenkit commented 7 years ago

Hehe, I never looked at it from that perspective. I thought your PC consumed more power than it would cost on AWS.

TTN- commented 7 years ago

It's processing epoch 9 atm. I'm playing around chatting with epoch 8, and it's looking rather promising. That thing has absolutely watched too many cop movies, that much is clear, haha.

examples.t7 https://www.dropbox.com/s/zhlha2sh335zypy/examples.t7?dl=0
model.t7 https://www.dropbox.com/s/7nqpt7ogq8cigwm/model_epoch_8.t7.zip?dl=0
vocab.t7 https://www.dropbox.com/s/hzrsrxlbu0w5qz6/vocab.t7?dl=0

I'll be back in about 8-10 hours and will paste the other links when the upload is finished. Epoch 8 stats:

  Errors: min= 1.2443210083606  
          max= 5.8878899451946  
       median= 2.1652770519947  
         mean= 2.1942398971716  
          std= 0.33685866438056 
          ppl= 8.9731779260862  

EDIT: all uploaded, links as above. Remember to rename model_epoch_8.t7 to model.t7 (I could rename it, but my connection is crap and I don't really want to reupload).

kenkit commented 7 years ago

Nice, keep up the good work. But let's still try to get the full training set done.

kenkit commented 7 years ago

What specs does your PC have, btw? It looks like a nice machine.

TTN- commented 7 years ago

Thanks :-) It's a dv6 from late 2011. A couple of years ago I got hold of the highest-performing mobile CPU of that generation, an i7-2860QM. I got it second-hand for like $50 from a buddy who stopped using it in his gaming laptop (lucky me!). It still holds its own even against some of the newer processors, which goes to show how much CPU speed development has slowed in recent years.

I'm planning on building a desktop PC with a GTX 1060 6GB (the 1070 looks tempting, but it's a bit pricey). Once I have that, I'll see if I can crunch the full dataset and upload the result. This stuff is fascinating.

kenkit commented 7 years ago

I'm also eyeing a better desktop. I'm currently on an AMD X4 B93 at 2.8GHz; it's the best machine I've used so far for my projects. I hope I can use it to fund a better machine.

TTN- commented 7 years ago

You could drop a graphics card into what you have and it should work great. Training really should be done on GPUs rather than CPUs. Good reading: http://timdettmers.com/2017/03/19/which-gpu-for-deep-learning/

kenkit commented 7 years ago

I was disappointed to discover that I'm on a proprietary power supply limited to 300 watts. I saw a ~~GTX 970~~ (not really sure), but this is the only one that might work. I couldn't find a store with this GPU, so I'll have to buy one online, but I still think it's better to build a new machine instead. I have an HP Compaq 6005 Pro SFF PC, btw. EDIT: I meant a GT 730 GDDR5, with a minimum power-supply requirement of 300W.

TTN- commented 7 years ago

https://www.dropbox.com/s/9xc9fisil1xee69/model.t7.zip?dl=0 That's epoch 10 completed there. ^

I can continue to post good model results if people are interested. I bought a GTX 1060 6GB a few days ago; more parts are arriving in the mail. I'll be able to grind more data much, much quicker :)

bienbinod commented 7 years ago

@TTN- Thank you for sharing the model. I downloaded it and started a conversation, but it really doesn't converse well. Below is a snippet of the conversation. Could you please share a snippet of a good conversation that you had?

    you> hi
    neuralconvo> It's not the cut that's bothering me.

    you> how are you?
    neuralconvo> I'm not sure, but you don't want to see the dress.

    you> Which dress are you talking about?
    neuralconvo> I'm not.

    you> Are you robot?
    neuralconvo> I'm not.

    you> Are you human?
    neuralconvo> I'm not.

    you> Who are you?
    neuralconvo> I'm mimi i'm dying.

    you> Don't die
    neuralconvo> I'm sorry about your plane.

    you> What are you talking about dude?
    neuralconvo> I'm not.
    you>

TTN- commented 7 years ago

@bienbinod I didn't have very good conversations with it either, sorry. It's mostly been the same as what you show there. It's been trained on too small a dataset, I think. Once I get that new gaming PC running (atm I'm limited to CPU training on my laptop, which takes forever) I'll train it on the full dataset and see how that goes. I'll share that too.

bienbinod commented 7 years ago

@TTN- Thank you for your reply. Same here; I am also relying on a laptop, and it is going to take ages to train the model on the full dataset. Let's keep posting models, whoever builds one first.

TTN- commented 7 years ago

I'll also have a play around with this: https://github.com/mtanana/torchneuralconvo It has some additional features.

kenkit commented 7 years ago

Let me download this. Thanks, guys, for sharing and making this available. Maybe we should make a repo with trained models (.t7), detailing the number of epochs trained, CPU info, and other details.

kenkit commented 7 years ago

By the way, @TTN-, share CPU-loadable versions when you can.

TTN- commented 7 years ago

@kenkit Are GPU-trained models CPU-loadable?

I'll continue to share what I make progress on.

This PC build is going to be at least a week, maybe two, away. I'm waiting for the Ryzen 5 CPU launch on April 11 to finish the build.

TTN- commented 7 years ago

Not a bad idea to have a repo. It's probably best to share .torrent files or magnet links, or to put it on The Pirate Bay. The GitHub limit is 1GB, and my Dropbox only has so much space; Google Drive will hold 15GB.

kenkit commented 7 years ago

@TTN- They are loadable; you just need to load the model, convert it to CPU-loadable with model = model:float(), then save, and you are done.

TTN- commented 7 years ago

Cheers, thanks @kenkit for the tip. I only program in C and Python; Torch is a bit different, lots to learn.

I'm still building that PC. I got a new Ryzen 5 CPU at launch on the 11th of April; I'm just waiting for it to arrive in the snail mail (it got delayed for some reason). It should be here tomorrow. More trained models to be posted once I get things up and running. I haven't forgotten about this :-)

TTN- commented 7 years ago

@kenkit Could you post a file for me to run to convert to float? I'm no good with Lua, sorry. I'm trying a couple of things, but the resulting file is 4 bytes in size. I'm pretty sure the model is float as is.

Uploading the newly trained data now. Stats:

Epoch stats:    
  Errors: min= 1.8776668432881  
          max= 8.6376652998083  
       median= 4.2078251736138  
         mean= 4.2796026050325  
          std= 0.80612978381653 
          ppl= 72.211737723467  

Full terminal output paste (for stats and such): https://pastebin.com/c6pJcCCP File upload: https://www.dropbox.com/sh/v3smqi6ee8iycjt/AAD-Hx4fqHJK6qumXIvgElLga?dl=0

hardware:

Took me a while to get it trained up to this point; my Ryzen system was unstable for a while and crashed a bunch of times, but that's fixed now. Interestingly, the program is heavily CPU-constrained: it maxes out a single core (of the 12) and is limited by that while the GPU sits mostly idle, even though I was training with --cuda. Video memory usage sat at around 2.5GB most of the time (I have the full movie dataset loaded, with no limit on vocabulary).

The perplexity (ppl) was decreasing fast up to this point; from here on, I think more training will just result in overfitting.

kenkit commented 7 years ago

Just load the model normally, then convert the loaded model to float and save it as you would any other model. I found this here: https://groups.google.com/forum/#!topic/torch7/ugBCwaoXw_s and https://groups.google.com/forum/#!msg/torch7/i8sJYlgQPeA/au-WVMSmbvkJ

If you don't manage to convert it, ping me and I'll put together complete working code you can use. EDIT: My PC's power supply fried, and I'm currently on a laptop which is too slow. Anyway, let me come up with something right away.

kenkit commented 7 years ago

Try this; just put it where we have train.lua.

filename: gpu_to_cpu.lua

    require 'neuralconvo'
    require 'xlua'
    require 'optim'
    require 'cutorch'  -- CUDA packages are needed to deserialize a GPU-trained model
    require 'cunn'

    -- Load the GPU model, convert its tensors to CPU floats, and save a copy
    model = torch.load("data/model.t7")
    model = model:float()

    torch.save("data/cpu_model.t7", model)
TTN- commented 7 years ago

Sweet. Thanks @kenkit

I did that and tested the result, but it throws an error when testing with th eval.lua:

user@machine:~/Projects/macournoyer-neuralconvo$ th eval.lua 
Loading vocabulary from data/vocab.t7 ...   
-- Loading model    

Type a sentence and hit enter to submit.    
CTRL+C then enter to quit.

you> hi
/home/user/Scripts-libs/torch/install/bin/luajit: eval.lua:57: attempt to index global 'model' (a nil value)
stack traceback:
    eval.lua:57: in function 'say'
    eval.lua:71: in main chunk
    [C]: in function 'dofile'
    ...libs/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
    [C]: at 0x00405d50
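One plausible cause (an assumption, not something confirmed in the thread) is a filename mismatch: gpu_to_cpu.lua saves to data/cpu_model.t7, while eval.lua appears to load data/model.t7, so the global model stays nil when the expected file isn't found. Swapping the files into place, sketched here in Python with empty stand-in files, would look like:

```python
import os

os.makedirs("data", exist_ok=True)
# Empty stand-ins for the real files (for illustration only)
open("data/model.t7", "w").close()       # GPU-trained model
open("data/cpu_model.t7", "w").close()   # output of gpu_to_cpu.lua

# Keep a backup of the GPU model, then move the CPU model to the
# filename eval.lua is assumed to load
os.rename("data/model.t7", "data/model_gpu.t7")
os.rename("data/cpu_model.t7", "data/model.t7")
```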
kenkit commented 7 years ago

You might want to check that the files actually exist. After training completes, the generated files usually have this structure:

    $ ls   # in neuralconvo/data (master)
    cornell_movie_dialogs/
    examples.t7
    model.t7
    vocab.t7

You are currently the only hope we have of getting some working files. I've managed to get a laptop that should put me back to programming, though it's not fast enough.

kenkit commented 7 years ago

I trained ages ago and ended up with a ~35MB model file:

drwxr-xr-x 1 Cosmo 197609    0 Jun  6  2016 cornell_movie_dialogs/
-rw-r--r-- 1 Cosmo 197609  19M Jun  4  2016 examples.t7
-rw-r--r-- 1 Cosmo 197609  35M Jun  6  2016 model.t7
-rw-r--r-- 1 Cosmo 197609 1.4M Jun  4  2016 vocab.t7

Also, you should know that after switching from CPU to GPU or vice versa, you must first delete the generated files, as they will not be usable.
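That cleanup step can be scripted. This is a sketch using empty stand-in files; after a real training run, examples.t7, model.t7, and vocab.t7 live in data/.

```python
import os

# Create empty stand-ins for the files train.lua generates (illustration only)
os.makedirs("data", exist_ok=True)
generated = ["examples.t7", "model.t7", "vocab.t7"]
for name in generated:
    open(os.path.join("data", name), "w").close()

# Delete the generated files before retraining after switching
# between CPU and GPU (--cuda) modes
for name in generated:
    path = os.path.join("data", name)
    if os.path.exists(path):
        os.remove(path)
```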

kenkit commented 7 years ago

Did you try my code? If not, you should paste it into a new Lua file and run it from the shell.