DeepSE / AILogs

AI LOG from 2020-07-02 #14

Closed hunkim closed 4 years ago

hunkim commented 4 years ago

article

Wow, this video is awesome!

GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding (Paper Explained)

Google builds a 600 billion parameter transformer to do massively multilingual, massive machine translation. Interestingly, the larger model scale does not c...

https://www.youtube.com/watch?v=1VdEw_mGjFk