Exception: Error opening environment: No such file or directory #62

Closed mmitkevich closed 8 years ago

mmitkevich commented 9 years ago

I have a problem with the db path when running the version from git:

cd /home/mike/github/hustle
DISCO_ROOT=/home/mike/github/disco bin/hustle
>>> pixels = Table.create('pixels',
...     columns=['index string token', 'index uint8 isActive', 'index site_id', 'uint32 amount',
...              'index int32 account_id', 'index city', 'index trie16 state', 'index int16 metro',
...              'string ip', 'lz4 keyword', 'index string date'],
...     partition='date',
...     force=True)
>>> insert(pixels,streams=[[{'token':'a'}]],decoder=lambda d:d)
>>> select(pixels.token,where=pixels)
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/home/mike/github/hustle/hustle/__init__.py", line 586, in select
    blobs = job.wait()
  File "/home/mike/.local/lib/python2.7/site-packages/disco/core.py", line 369, in wait
    timeout, poll_interval * 1000)
  File "/home/mike/.local/lib/python2.7/site-packages/disco/core.py", line 329, in check_results
    raise JobError(Job(name=jobname, master=self), "Status {0}".format(status))
JobError: Job select_from_pixels@59f:b7b65:10962 failed: Status dead

with the following logs on the master:

22:35:24.928 [info] Job select_from_pixels@59f:b7ad7:dcb05 failed: <<"Traceback (most recent call last):
  File \"/home/mike/github/disco/root/data/localhost/5d/select_from_pixels@59f:b7ad7:dcb05/home/mike/.local/lib/python2.7/site-packages/disco/worker/__init__.py\", line 345, in main
    job.worker.start(task, job, **jobargs)
  File \"/home/mike/github/disco/root/data/localhost/5d/select_from_pixels@59f:b7ad7:dcb05/home/mike/github/hustle/hustle/core/pipeworker.py\", line 203, in start
    self.run(task, job, **jobargs)
  File \"/home/mike/github/disco/root/data/localhost/5d/select_from_pixels@59f:b7ad7:dcb05/home/mike/.local/lib/python2.7/site-packages/disco/worker/pipeline/worker.py\", line 228, in run
    self.run_stage(task, stage, params)
  File \"/home/mike/github/disco/root/data/localhost/5d/select_from_pixels@59f:b7ad7:dcb05/home/mike/github/hustle/hustle/core/pipeworker.py\", line 254, in run_stage
    stage.process(interface, state, label, inp, task)
  File \"/home/mike/github/hustle/hustle/core/pipeline.py\", line 516, in process_restrict
    for key, value in islice(inp, 0, limit):
  File \"/home/mike/github/disco/root/data/localhost/5d/select_from_pixels@59f:b7ad7:dcb05/home/mike/.local/lib/python2.7/site-packages/disco/worker/__init__.py\", line 582, in __iter__
    for item in iter:
  File \"/home/mike/github/disco/root/data/localhost/5d/select_from_pixels@59f:b7ad7:dcb05/home/mike/.local/lib/python2.7/site-packages/disco/worker/__init__.py\", line 538, in next
    self.last, item = next(self.iter)
  File \"/home/mike/github/hustle/hustle/core/pipeline.py\", line 123, in hustle_input_stream
    otab = MarbleStream(fle)
  File \"/home/mike/github/hustle/hustle/core/marble.py\", line 566, in __init__
    self.marble = Marble.from_file(local_file)
  File \"/home/mike/github/hustle/hustle/core/marble.py\", line 175, in from_file
  File \"/db.pyx\", line 2210, in mdb.mdb_read_handle (liblmdb/db.c:36219)
  File \"/db.pyx\", line 252, in mdb.Env.__init__ (liblmdb/db.c:5167)
Exception: Error opening environment: No such file or directory

After some debugging I see that hustle_input_stream receives the wrong file path:

fle=u'/home/mike/github/disco/root/data/mike/github/disco/root/ddfs/vol0/blob/f7/hustleuq5Uwn$59f-b190d-b6730'

The path contains '/mike/github' twice. This is either a bug in the util.localize function or a misconfiguration on my side; I can't figure out which.
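My guess at how such a path could come about, as a rough sketch only. localize_sketch below is my own stand-in and not the real util.localize, and both root directories are assumptions read off the paths above. If a function like this expects "<prefix>/<rest>" and is handed an absolute path that has lost its leading slash, 'home' is taken as the prefix and the rest is joined onto the data root:

```python
import posixpath

# Assumptions: both roots are inferred from the paths in the traceback,
# not read from any disco configuration.
DISCO_DATA = '/home/mike/github/disco/root/data'
DDFS_DATA = '/home/mike/github/disco/root/ddfs'


def localize_sketch(path, ddfs_data=DDFS_DATA, disco_data=DISCO_DATA):
    """Hypothetical stand-in: treat the first path component as a prefix."""
    prefix, fname = path.split('/', 1)
    if prefix == 'ddfs':
        return posixpath.join(ddfs_data, fname)
    return posixpath.join(disco_data, fname)


absolute = '/home/mike/github/disco/root/ddfs/vol0/blob/f7/hustleuq5Uwn$59f-b190d-b6730'

# Without its leading slash the absolute path looks like "home/<rest>",
# so <rest> ends up appended to the data root.
doubled = localize_sketch(absolute.lstrip('/'))
print(doubled)

# A path that really starts with the 'ddfs' prefix resolves as expected.
print(localize_sketch('ddfs/vol0/blob/f7/example'))
```

The first printed path is exactly the fle value I see in the debugger, which is why I suspect an absolute path is being passed where a relative one is expected.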


Could you please advise what I did wrong?

mmitkevich commented 9 years ago

After a bit more debugging I commented out the localhost support in disco/util.py, and it worked like a charm.

def urlsplit(url, localhost=None, disco_port=None, **kwargs):
    scheme, rest = schemesplit(url)
    locstr, path = rest.split('/', 1)  if '/' in rest else (rest ,'')
    if scheme == 'tag':
        if not path:
            path, locstr = locstr, ''
    else:
        disco_port = disco_port or str(DiscoSettings()['DISCO_PORT'])
        host, port = netloc.parse(locstr)
        if scheme == 'disco' or port == disco_port:
            #if localhost == True or locstr == localhost:
            #    scheme = 'file'
            #    locstr = ''
            #    path = localize(path, **kwargs)
            #elif scheme == 'disco':
                scheme = 'http'
                locstr = '{0}:{1}'.format(host, disco_port)
    return scheme, netloc.parse(locstr), path

It seems this code doesn't handle a file:// URL that points at a long absolute path such as file://long/path/to/disco/home.
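To show what I mean, here is the locstr/path expression from urlsplit on its own. split_rest is just my wrapper around that one line, the inputs are made-up examples, and I am assuming schemesplit hands back everything after '://':

```python
def split_rest(rest):
    """Same expression as the locstr/path line in urlsplit."""
    return tuple(rest.split('/', 1)) if '/' in rest else (rest, '')


# A host-style remainder splits into (location, relative path).
print(split_rest('localhost:8989/ddfs/vol0/blob/f7/example'))

# An absolute file path yields an empty location, and the path
# comes back without its leading slash.
print(split_rest('/home/mike/github/disco/root/ddfs/vol0/blob/f7/example'))
```

So for an absolute file path, whatever consumes the path afterwards sees 'home/mike/...' and can no longer tell that it was absolute.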

Any comments?

oldmantaiter commented 9 years ago

@mmitkevich What version of disco are you running locally?

mmitkevich commented 9 years ago


~/github/disco$ git describe


oldmantaiter commented 9 years ago

Thanks, it appears that util.localize is combining the data home and the ddfs home. I'll take a look into it.

oldmantaiter commented 9 years ago

Could you post your disco configuration as well?

mmitkevich commented 9 years ago
~/github/disco$ cat root/disco_8989.config 

Not sure exactly which files you want...

oldmantaiter commented 9 years ago

I'm looking for the settings.py file that the disco cluster reads when it starts.

mmitkevich commented 9 years ago

~/github/disco/root/data/localhost/e9/select_from_pixels@59f:c6413:4e2eb/home/mike/.local/lib/python2.7/site-packages/disco$ cat settings.py

oldmantaiter commented 9 years ago

Sorry, I should have been clearer: is there a configuration file that you have written to /etc/disco/settings.py? Or are you using environment variables to configure the disco cluster before starting it?

mmitkevich commented 9 years ago

I have no /etc/disco at all. I start disco by running:

cd ~/github/disco
export DISCO_HOME=/home/mike/github/disco
bin/disco nodaemon

oldmantaiter commented 9 years ago

Ok, thanks. The strange part about commenting out that section is that, after some quick local testing, the conditional does not appear to be hit at all: the scheme in this case is file, and localhost is not set to True by the caller.
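Roughly, the check boils down to the sketch below. branch_taken is a made-up helper that mirrors the nested condition from the urlsplit snippet quoted earlier, and passing port=None for a file URL is an assumption about what netloc.parse returns for an empty location:

```python
def branch_taken(scheme, port, disco_port, localhost, locstr):
    """Hypothetical mirror of the condition guarding the localize call."""
    if scheme == 'disco' or port == disco_port:
        return localhost is True or locstr == localhost
    return False


# file scheme, no port, localhost left at its default of None:
# the outer condition already fails.
print(branch_taken('file', None, '8989', None, ''))

# For comparison, a disco:// URL with localhost=True would take the branch.
print(branch_taken('disco', '8989', '8989', True, 'localhost'))
```

If that holds for your setup too, the commented-out block should never have run for a file URL, which is why the blob URLs stored in DDFS are the next thing to check.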

Could you send me the output of ddfs blobs <HUSTLE TAG NAME>? This might be a disco issue.

mmitkevich commented 9 years ago
mike@bukake:~/github/disco/bin$ ./ddfs blobs hustle:pixels