minimaxir / gpt-2-simple

Python package to easily retrain OpenAI's GPT-2 text-generating model on new texts

Should <|startoftext|> and <|endoftext|> wrap every line or every article? #279

Open bag7dad opened 2 years ago

bag7dad commented 2 years ago

Should I use <|startoftext|> and <|endoftext|> on every new line or on every new article?

That is, do I use them on every line within the same article, for example one about "fitness":

<|startoftext|> fitness article line 1 <|endoftext|>
<|startoftext|> fitness article line 2 <|endoftext|>

<|startoftext|> health article line 1 <|endoftext|>
<|startoftext|> health article line 2 <|endoftext|>
<|startoftext|> health article line 3 <|endoftext|>

Or do I use them once per article:

<|startoftext|> fitness article 1000 word <|endoftext|>
<|startoftext|> health article 1000 word <|endoftext|>
xiboon commented 2 years ago

Use them once per article. <|startoftext|> and <|endoftext|> tell the model where each text starts and ends, so it can generate more relevant output.
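
For what it's worth, here is a minimal sketch of the per-article wrapping in plain Python. The `wrap_articles` helper and the sample strings are made up for illustration and are not part of gpt-2-simple; the only assumption is that each article is already available as one string.

```python
START = "<|startoftext|>"
END = "<|endoftext|>"


def wrap_articles(articles):
    """Wrap each whole article (not each line) in the start/end tokens.

    Hypothetical helper for illustration; not a gpt-2-simple function.
    """
    wrapped = []
    for text in articles:
        text = text.strip()
        if text:  # skip empty entries so no empty token pairs are emitted
            wrapped.append(f"{START}{text}{END}")
    # One wrapped article per entry, separated by a newline
    return "\n".join(wrapped)


# Placeholder articles; line breaks inside an article are kept as-is
articles = [
    "fitness article line 1\nfitness article line 2",
    "health article line 1\nhealth article line 2\nhealth article line 3",
]

training_text = wrap_articles(articles)
print(training_text)
```

The resulting string can be written to a text file and used as the finetuning dataset. Each token appears once per article, no matter how many lines the article has.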