ratsgo / embedding

Korean Embeddings (Sentence Embeddings Using Korean Corpora)
https://ratsgo.github.io/embedding
MIT License

p.83 Korean Wikipedia download certificate issue #136

Closed grohong closed 2 years ago

grohong commented 2 years ago

Environment

git clone https://github.com/ratsgo/embedding.git
cd embedding
docker build -t ratsgo/embedding-cpu -f docker/Dockerfile-CPU .
docker run -it --rm ratsgo/embedding-cpu bash

Problem

root@f85381fd3210:/notebooks/embedding# bash preprocess.sh dump-raw-wiki
download ko-wikipedia...
--2022-02-02 14:43:11--  https://dumps.wikimedia.org/kowiki/latest/kowiki-latest-pages-articles.xml.bz2
Resolving dumps.wikimedia.org (dumps.wikimedia.org)... 208.80.154.7, 2620:0:861:1:208:80:154:7
Connecting to dumps.wikimedia.org (dumps.wikimedia.org)|208.80.154.7|:443... connected.
ERROR: cannot verify dumps.wikimedia.org's certificate, issued by ‘CN=R3,O=Let's Encrypt,C=US’:
  Issued certificate has expired.
To connect to dumps.wikimedia.org insecurely, use `--no-check-certificate'.

ratsgo commented 2 years ago

Hello, to work around the certificate issue, you can download the file as shown below.

wget https://dumps.wikimedia.org/kowiki/latest/kowiki-latest-pages-articles.xml.bz2 -P /notebooks/embedding/data/raw --no-check-certificate

If you want to download to a different directory, change the /notebooks/embedding/data/raw path.
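
For anyone who changes the target directory often, the command above could be wrapped in a small shell function. This is only a sketch: `download_kowiki` and the `DRY_RUN` variable are hypothetical names introduced here for illustration and are not part of the repository or of `preprocess.sh`. The URL, the default path, and the `wget` options are the ones from the command above.

```shell
# Sketch only: download_kowiki and DRY_RUN are hypothetical, not part of the repo.
DUMP_URL="https://dumps.wikimedia.org/kowiki/latest/kowiki-latest-pages-articles.xml.bz2"

download_kowiki() {
  # First argument: target directory (defaults to the path used in this thread).
  target_dir="${1:-/notebooks/embedding/data/raw}"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it, to check the path first.
    echo "wget $DUMP_URL -P $target_dir --no-check-certificate"
  else
    # --no-check-certificate turns off TLS verification; treat it as a workaround.
    wget "$DUMP_URL" -P "$target_dir" --no-check-certificate
  fi
}
```

Calling `download_kowiki /some/other/dir` would then download into that directory, and setting `DRY_RUN=1` beforehand only prints the command. Since `--no-check-certificate` skips certificate verification entirely, it is probably best kept to this one download rather than set as a default.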

grohong commented 2 years ago

Got it! Thanks for checking~