Closed: fpgaminer closed this issue 4 years ago
Omg this is a bug, I'm pretty sure I meant to use `-1e10` instead of `1e-10`. Nice find, thank you!
I believe https://github.com/karpathy/minGPT/commit/8909e1b646d6fd5235ec33259fb22fdc2c91037c is the fix, ty.
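For reference, the patched helper ends up looking roughly like this (a paraphrase, not a verbatim copy of the commit): entries below the top-k cutoff are filled with negative infinity, so they receive exactly zero probability after the softmax.

```python
import torch

def top_k_logits(logits, k):
    # Take the k-th largest value per row as a threshold, then push
    # every logit below it to -inf so softmax gives those tokens zero mass.
    v, _ = torch.topk(logits, k)
    out = logits.clone()
    out[out < v[:, [-1]]] = -float('Inf')  # previously 1e-10
    return out
```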
Thanks for the quick fix! And thank you for this repo. I've been meaning to play with NLP and GPT, but have been a bit daunted by it. This repo made it easy to dive in and start tinkering.
I have a possible improvement to the `mingpt.utils.top_k_logits` function. I was using the `play_char` notebook to train against the IMDB dataset, but was getting really terrible samples out of it after training unless I set the temperature very low. Looking into the sampling code, I noticed the odd choice of `1e-10` in `top_k_logits`. It seemed odd because most logits are negative, so filling with `1e-10` can actually make many characters *more* probable rather than masking them out. Replacing it with negative infinity vastly improved sampling for me. A demonstration follows below. I'm happy to open a pull request, just let me know.

Demo Code:
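A minimal sketch along these lines reproduces the effect (the toy logits are invented for illustration, and the fill value is parameterized so both variants can be compared):

```python
import torch
import torch.nn.functional as F

def top_k_logits(logits, k, fill):
    # Same masking logic as mingpt.utils.top_k_logits, but with the
    # fill value as a parameter so both variants can be compared.
    v, _ = torch.topk(logits, k)
    out = logits.clone()
    out[out < v[:, [-1]]] = fill
    return out

# Toy logits: mostly negative, as trained-model logits tend to be.
logits = torch.tensor([[-2.0, -1.0, -3.0, -5.0, -4.0]])

# Buggy fill: the masked logits become ~0.0, which is *larger* than the
# two surviving logits (-1 and -2), so the masked tokens dominate.
buggy = F.softmax(top_k_logits(logits, k=2, fill=1e-10), dim=-1)

# Fixed fill: -inf logits get exactly zero probability after softmax.
fixed = F.softmax(top_k_logits(logits, k=2, fill=-float('Inf')), dim=-1)

print(buggy)
print(fixed)
```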
Output:
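Run as-is, the sketch prints roughly the following; under the `1e-10` fill, each of the three masked tokens ends up more likely than either token that survived the top-k cut:

```
tensor([[0.0386, 0.1050, 0.2855, 0.2855, 0.2855]])
tensor([[0.2689, 0.7311, 0.0000, 0.0000, 0.0000]])
```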
EDIT: A slight addendum: I just have to say how impressive the results of this model are with the fixed sampling, given that I only trained it for a few hours on a 2070.