zle1992 closed this issue 4 years ago
There are two ways to read the data and preprocess it:
What are the details of the dataset?
@loveJasmine Sorry, research data may be protected by copyright under Chinese law, so I can't share details of the dataset. Basically, each record contains a label (1 or 0) and the text of two sentences.
OK. In that case, could you give us a description of the dataset format and some samples?
@loveJasmine Sorry, I can't.
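Hello, how large is your dataset? The problem I'm running into is that my data is too large: I have 500,000 articles, trained with Word2vec at an embedding size of 128, taking 300 words per article. That makes an array of 500,000 × 300 × 128 values, which can't all be loaded into memory, so I can't train. How can this problem be solved?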
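For scale, 500,000 × 300 × 128 values stored as float32 come to about 76.8 GB. One common workaround, sketched here under assumptions and not taken from this repository, is to keep only the integer word indices in memory (500,000 × 300 int32 values, about 0.6 GB) and look up the Word2vec vectors one batch at a time. The names `dense_size_gb`, `batch_generator`, `token_ids`, and `embedding_matrix`, and the tiny random demo data, are all hypothetical.

```python
import numpy as np

def dense_size_gb(n_docs, seq_len, dim, bytes_per_value=4):
    """Size in GB (10^9 bytes) of a dense n_docs x seq_len x dim array."""
    return n_docs * seq_len * dim * bytes_per_value / 1e9

def batch_generator(token_ids, labels, embedding_matrix, batch_size):
    """Yield (embedded_batch, label_batch) pairs, embedding one batch at a time."""
    n_docs = token_ids.shape[0]
    for start in range(0, n_docs, batch_size):
        ids = token_ids[start:start + batch_size]
        # Integer-array indexing maps (batch, seq_len) ids
        # to (batch, seq_len, dim) vectors.
        yield embedding_matrix[ids], labels[start:start + batch_size]

# Memory needed for the full float32 array described in the question.
full_gb = dense_size_gb(500_000, 300, 128)

# Tiny made-up data, only to show the shapes (assumption: vocabulary of 1000 words).
rng = np.random.default_rng(0)
embedding_matrix = rng.normal(size=(1000, 128)).astype(np.float32)
token_ids = rng.integers(0, 1000, size=(10, 300), dtype=np.int32)
labels = rng.integers(0, 2, size=10)

batch_shapes = [
    x.shape for x, _ in batch_generator(token_ids, labels, embedding_matrix, 4)
]
```

If the model has its own embedding layer initialized from the Word2vec weights, the generator can yield the integer ids directly and leave the lookup to the model.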