nlp-with-transformers / notebooks

Jupyter notebooks for the Natural Language Processing with Transformers book
https://transformersbook.com/
Apache License 2.0

Chapter 7: Unable to start ElasticSearch as python process from Notebook #110

Open mwunderlich opened 1 year ago

mwunderlich commented 1 year ago

Information

The problem arises in chapter:

Describe the bug

Environment:

ES downloaded like this in cell 28:

url = """https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-8.9.0-darwin-x86_64.tar.gz"""
!wget -nc -q {url}
!tar -xzf elasticsearch-8.9.0-darwin-x86_64.tar.gz

When executing cell 29 to launch ES from Python, I get the following error, repeated recursively for every directory: chown: elasticsearch-8.9.0/config/jvm.options.d: Permission denied

Trying to launch directly from terminal doesn't work either, even after doing a sudo chown.
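On POSIX systems, handing a file over to a different user typically requires superuser privileges, so a chown issued from a notebook kernel running as a regular user would be expected to fail this way. A minimal diagnostic sketch, assuming a POSIX system; the helper name describe_owner is hypothetical and not part of the notebook:

```python
import os
import pwd


def describe_owner(path):
    """Return who owns `path` and who is running this process."""
    def name_for(uid):
        try:
            return pwd.getpwuid(uid).pw_name
        except KeyError:
            # The uid has no passwd entry (can happen in some containers).
            return str(uid)

    return {
        "path_owner": name_for(os.stat(path).st_uid),
        "current_user": name_for(os.geteuid()),
        "is_root": os.geteuid() == 0,
    }


# Example, using the directory name from the report above:
# print(describe_owner("elasticsearch-8.9.0"))
```

If is_root is False, the chown to daemon:daemon would need to be run with sudo from a terminal instead of from the notebook.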

To Reproduce

Steps to reproduce the behavior:

  1. Download ES
  2. Try to run cell 29

Expected behavior

ES should start up.

fredsh2k commented 1 year ago

Same here, I couldn't start Elasticsearch from the notebook itself.

But running sudo chown -R daemon:daemon elasticsearch-8.9.0 from the terminal, entering my password, and then executing

from subprocess import PIPE, STDOUT, Popen

es_server = Popen(['elasticsearch-8.9.0/bin/elasticsearch'],
                  stdout=PIPE, stderr=STDOUT)  # NOTE: without preexec_fn

worked for me.

(Environment: M1 Max Ventura 13.4.1, Python 3.9.6)
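The workaround above can be sketched as a launcher that only passes preexec_fn when the kernel runs as root. This assumes the notebook's preexec_fn is there to switch the child process to an unprivileged user, which is an inference from the note above and is not shown in this thread. The helper names and the lookup of a "daemon" user are illustrative assumptions:

```python
import os
import pwd
from subprocess import PIPE, STDOUT, Popen


def _switch_to_daemon():
    # Runs in the child process just before exec. Only root may call
    # setuid to become another user. Assumes a "daemon" user exists.
    os.setuid(pwd.getpwnam("daemon").pw_uid)


def make_popen_kwargs(is_root):
    """Build Popen keyword arguments, dropping privileges only when root."""
    kwargs = {"stdout": PIPE, "stderr": STDOUT}
    if is_root:
        kwargs["preexec_fn"] = _switch_to_daemon
    return kwargs


# Example, using the path from the comment above:
# es_server = Popen(["elasticsearch-8.9.0/bin/elasticsearch"],
#                   **make_popen_kwargs(os.geteuid() == 0))
```

On a local machine where the kernel runs as a regular user, this reduces to the plain Popen call shown above.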

kirahman2 commented 4 months ago

You will need to enter a password. Here is what worked for me, though I can't execute the Haystack cells because the library has been updated since the ch. 7 v2 notebook.

import getpass
import subprocess
import time

# Get the sudo password without echoing it to the screen
password = getpass.getpass(prompt='Enter password:')

# Change ownership of the Elasticsearch directory.
# sudo -S reads the password from stdin, so it never appears in a shell command.
subprocess.run(['sudo', '-S', 'chown', '-R', 'daemon:daemon', 'elasticsearch-7.9.2'],
               input=password + '\n', text=True,
               stdout=subprocess.PIPE, stderr=subprocess.PIPE)

# Run Elasticsearch as a background process
es_server = subprocess.Popen(args=['elasticsearch-7.9.2/bin/elasticsearch'],
                             stdout=subprocess.PIPE, stderr=subprocess.STDOUT)

# Wait until Elasticsearch has started
time.sleep(30)
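A fixed 30-second wait can be too short on a slow machine and needlessly long on a fast one. A minimal sketch of polling instead, assuming Elasticsearch listens on its default HTTP port 9200 on localhost; the helper name wait_for_http is hypothetical. Elasticsearch 8.x enables security by default, so the plain-HTTP URL below may only suit a 7.x install or one with security disabled:

```python
import http.client
import time
import urllib.error
import urllib.request

# Bypass any configured proxy, since the target is a local server.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def wait_for_http(url, timeout=60.0, interval=1.0):
    """Poll `url` until a server answers or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with _opener.open(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status (e.g. 401).
            return True
        except (OSError, http.client.HTTPException):
            # Not reachable yet: connection refused, reset, or timed out.
            time.sleep(interval)
    return False


# Example (URL is an assumption based on the default port):
# started = wait_for_http("http://localhost:9200", timeout=120)
```

The function returns False rather than raising when the deadline passes, so the caller can decide whether to inspect es_server's output or retry.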