ko-nlp / Korpora

Korean corpus repository
Creative Commons Attribution 4.0 International
694 stars 80 forks

Show description and attributes Korpus, KorpusData, NSMC, NSMCData (fix #14) #18

Closed: lovit closed 4 years ago

lovit commented 4 years ago

Here is the usage.

from Korpora import NSMC

nsmc = NSMC(root_dir='./Korpora/')
print(str(nsmc))
NSMC
    Reference: https://github.com/e9t/nsmc

    Naver sentiment movie corpus v1.0
    This is a movie review dataset in the Korean language.
    Reviews were scraped from Naver Movies.

    The dataset construction is based on the method noted in
    [Large movie review dataset][^1] from Maas et al., 2011.

    [^1]: http://ai.stanford.edu/~amaas/data/sentiment/

Attributes
 NSMC.train : size=150000
 NSMC.test : size=50000
print(str(nsmc.train))
NSMCData
    Naver sentiment movie corpus v1.0. size of data=150000

Attributes:
  NSMCData.texts (list[str]) : size=150000
  NSMCData.labels (list[int]) : size=150000
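
For readers who want to see how output of this shape could be produced, here is a minimal sketch of a `__str__` that prints a description followed by attribute sizes. It is not the Korpora implementation: the class names `SketchCorpus` and `SketchData`, their constructors, and the toy data are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SketchData:
    """Hypothetical stand-in for a data split such as NSMCData."""
    description: str
    texts: List[str]
    labels: List[int]

    def __len__(self) -> int:
        return len(self.texts)

    def __str__(self) -> str:
        name = self.__class__.__name__
        return (
            f"{name}\n"
            f"    {self.description}. size of data={len(self)}\n"
            f"\n"
            f"Attributes:\n"
            f"  {name}.texts (list[str]) : size={len(self.texts)}\n"
            f"  {name}.labels (list[int]) : size={len(self.labels)}"
        )


class SketchCorpus:
    """Hypothetical stand-in for a corpus such as NSMC."""

    def __init__(self, description: str, train: SketchData, test: SketchData):
        self.description = description
        self.train = train
        self.test = test

    def __str__(self) -> str:
        name = self.__class__.__name__
        lines = [name]
        # Indent each non-empty description line; keep blank lines blank.
        lines += [f"    {line}" if line else "" for line in self.description.splitlines()]
        lines += ["", "Attributes"]
        for attr in ("train", "test"):
            lines.append(f" {name}.{attr} : size={len(getattr(self, attr))}")
        return "\n".join(lines)


# Made-up toy data, not taken from NSMC.
train = SketchData("Toy corpus v0", ["good", "bad", "fine"], [1, 0, 1])
test = SketchData("Toy corpus v0", ["meh"], [0])
corpus = SketchCorpus("Toy corpus v0\nA tiny made-up example.", train, test)

print(corpus)
print(corpus.train)
```

Defining `__str__` rather than `__repr__` keeps the default debugging representation intact while giving `print()` a human-readable summary, which matches the `print(str(nsmc))` usage shown above.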

ratsgo commented 4 years ago

The code has no particular problems, so I'll merge it right away.