bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License

(sqlite3.OperationalError) database is locked - NFS Share #1123

Closed aaiezza closed 8 years ago

aaiezza commented 8 years ago

After kicking off the following commands:

# bcbio_nextgen.py -w template gatk-variant project1.csv /atlas/garvin_inst_2014/raw_reads/NA12878D_HiSeqX_R1.fastq /atlas/garvin_inst_2014/raw_reads/NA12878D_HiSeqX_R2.fastq

# bcbio_nextgen.py ../config/project1.yaml

I get:

/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/matplotlib/__init__.py:872: UserWarning: axes.color_cycle is deprecated and replaced with axes.prop_cycle; please use the latter.
  warnings.warn(self.msg_depr % (key, alt_key))
[2015-11-28T06:02Z] Resource requests: bwa, sambamba, samtools; memory: 2.00, 2.00; cores: 16, 16, 16
[2015-11-28T06:02Z] Configuring 1 jobs to run, using 1 cores each with 2.00g of memory reserved for each job
[2015-11-28T06:02Z] Timing: organize samples
[2015-11-28T06:02Z] multiprocessing: organize_samples
[2015-11-28T06:02Z] Using input YAML configuration: /athena/bcbio_test/project1/config/project1.yaml
[2015-11-28T06:02Z] Checking sample YAML configuration: /athena/bcbio_test/project1/config/project1.yaml
Traceback (most recent call last):
  File "/usr/local/bin/bcbio_nextgen.py", line 226, in <module>
    main(**kwargs)
  File "/usr/local/bin/bcbio_nextgen.py", line 43, in main
    run_main(**kwargs)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 37, in run_main
    fc_dir, run_info_yaml)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 80, in _run_toplevel
    for xs in pipeline.run(config, run_info_yaml, parallel, dirs, samples):
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 139, in run
    [x[0]["description"] for x in samples]]])
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"])(joblib.delayed(fn)(x) for x in items):
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/joblib/parallel.py", line 657, in __call__
    self.dispatch(function, args, kwargs)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/joblib/parallel.py", line 404, in dispatch
    job = ImmediateApply(func, args, kwargs)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/joblib/parallel.py", line 142, in __init__
    self.results = func(*args, **kwargs)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/utils.py", line 50, in wrapper
    return apply(f, *args, **kwargs)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/distributed/multitasks.py", line 226, in organize_samples
    return run_info.organize(*args)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/pipeline/run_info.py", line 55, in organize
    item = add_reference_resources(item)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/pipeline/run_info.py", line 139, in add_reference_resources
    data["genome_resources"] = genome.get_resources(data["genome_build"], ref_loc, data)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/pipeline/genome.py", line 41, in get_resources
    return _ensure_annotations(cleaned, data)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/pipeline/genome.py", line 50, in _ensure_annotations
    resources["rnaseq"]["gene_bed"] = gtf.gtf_to_bed(transcript_gff, out_dir)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/rnaseq/gtf.py", line 58, in gtf_to_bed
    db = get_gtf_db(gtf)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/rnaseq/gtf.py", line 42, in get_gtf_db
    return gffutils.FeatureDB(db_file)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/gffutils/interface.py", line 130, in __init__
    ''')
sqlite3.OperationalError: database is locked

Any idea what this might be?

-Alex

chapmanb commented 8 years ago

Alex; Sorry about the issue. This normally occurs when another process is writing to a shared sqlite3 database, or when a previous process died and left a lock on the file. I'm not sure what genome you're working from, but if you look at the equivalent of:

/path/to/bcbio/genomes/Hsapiens/GRCh37/rnaseq/ref-transcripts.gtf.db

Are there lock files or active processes writing to this (say, from a simultaneous install)? If not, you could try removing the *.db file then re-running the installer (not as root, from the other thread) to re-build it.
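
As a rough way to check for leftovers next to the database, here is a minimal Python sketch. SQLite takes its locks through the operating system rather than through separate lock files, so what a crashed writer can leave on disk is a rollback journal (`-journal`) or write-ahead-log files (`-wal`, `-shm`). The function name `find_sqlite_sidecars` is hypothetical and is not part of bcbio or gffutils.

```python
import os

# Suffixes SQLite appends to the database filename for its rollback journal
# and write-ahead-log files.
SIDECAR_SUFFIXES = ("-journal", "-wal", "-shm")


def find_sqlite_sidecars(db_path):
    """Return the sidecar files that currently exist next to db_path."""
    return [db_path + suffix
            for suffix in SIDECAR_SUFFIXES
            if os.path.exists(db_path + suffix)]
```

Calling it on the ref-transcripts.gtf.db path and getting an empty list would suggest no interrupted write is lying around, which points the investigation toward the filesystem instead.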

Hope this helps fix things for you.

aaiezza commented 8 years ago

Should I not run the installer as root?

chapmanb commented 8 years ago

Alex; No, I'd avoid this. It's not a great idea to run anyone's third-party script as root. In bcbio's case, this would also require using root/sudo for every update, since the installed files will be owned by root. If you want to avoid any manual chmod steps, run the installer as your regular user with the --sudo flag, which will chmod directories to avoid needing to use root in the future. Hope this helps.

aaiezza commented 8 years ago

I've now run the installer both as root and as a regular user, and I've wiped the rna-transcripts.gtf.db file and rerun the installer altogether several times. The install seems successful every time, but when I go to run the pipeline I keep getting that same error message. I can access this file manually using sqlite3, but bcbio_nextgen.py gives this error message. I'd really like to get this to work. Is there anything else you can recommend? Maybe I have a configuration off or something... I've followed the automated installer instructions multiple times now and resolved every error I've encountered, but still nothing.

Thanks, Alex

chapmanb commented 8 years ago

Alex; Sorry about the continued problems. Is there anything special about the filesystem that might be causing problems? I know sqlite can have issues on some shared or samba mounted filesystems, so that might explain the issue.

I also pushed a fix to try and work around this. Since the BED file should exist from the data installation process, I avoid trying to get the GTF database unless it's missing. Hopefully that will skip over this step and let you get things running. If you upgrade with:

bcbio_nextgen.py upgrade -u development

it will grab this fix and hopefully get you going. Hope this helps.

aaiezza commented 8 years ago

Brad,

Good call on the filesystem. I happen to know that any remote sqlite databases will only allow read privileges. If there is writing going on that might explain the issue. I am using an NFS share to house all my data for the run. I suppose that might be recognized as remote from sqlite's stance.

I updated to the latest development build of bcbio and still no luck, sadly. I will try to put all the data on a local disk, but honestly this kind of destroys the paradigm of the product in my eyes. Sqlite's shortcomings here are reflected in the pipeline, unfortunately.

Cheers, Alex

roryk commented 8 years ago

Hi @aaiezza,

That database is never written to, only read from. Can you access the database with sqlite3 if you do it from the node the job is running on?
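
If the database really is read-only, one way to test that idea outside bcbio is to open it with SQLite's `immutable=1` URI parameter, which tells SQLite the file cannot change so it skips locking and change detection. This is only a diagnostic sketch: it assumes Python 3 (the `uri` argument to `sqlite3.connect` is not available in the Python 2.7 that bcbio ships here), and `open_immutable` is a hypothetical helper, not something gffutils or bcbio provides.

```python
import pathlib
import sqlite3


def open_immutable(db_path):
    """Open db_path read-only, telling SQLite the file will not change.

    With immutable=1, SQLite does not take file locks, so a broken NFS
    lock manager should not get in the way of reads.
    """
    uri = pathlib.Path(db_path).resolve().as_uri() + "?immutable=1"
    return sqlite3.connect(uri, uri=True)
```

If a query succeeds through this connection but fails through a normal one on the same path, that would point at file locking on the mount rather than at the database contents.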

roryk commented 8 years ago

Could you post up a ls -l of the files in the GRCh37/rnaseq directory?

aaiezza commented 8 years ago

image

@roryk I gave that a try after posting, actually. Also, I haven't parallelized anything yet; I'm just running on a single machine with 16 cores and 30 GB of memory. I can access the database file using sqlite3 as is. image

I've found forum posts, however, that describe multiple processes attempting to connect to the same db file, and the go-to solution seems to be increasing the connection timeout. I don't see how that is applicable here, nor would I know how to add it.
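
For reference, this is roughly what the timeout approach looks like with Python's standard `sqlite3` module: the `timeout` argument to `connect` is how long a connection waits for another connection's lock to clear before raising "database is locked". The helper name `connect_with_timeout` is made up for illustration, and nothing here says bcbio or gffutils exposes such a setting.

```python
import sqlite3


def connect_with_timeout(db_path, seconds=30.0):
    """Open db_path, waiting up to `seconds` for locks held by other
    connections before giving up with sqlite3.OperationalError."""
    return sqlite3.connect(db_path, timeout=seconds)
```

A longer timeout only helps when some other process holds the lock for a while and then releases it. If the lock can never be obtained, for example because locking itself fails on the mount, waiting longer should not change the outcome.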

roryk commented 8 years ago

If you download https://gist.githubusercontent.com/roryk/0868086be6490e303d99/raw/8440547af492a42cdf7d92bf69cc07dd1302b627/test_gffutils.py and run it with:

/path/to/the/installed/bcbio/python test_gffutils.py /usr/local/share/bcbio/genomes/Hsapiens/GRCh37/rnaseq/ref-transcripts.gtf

Do you encounter the same error?

aaiezza commented 8 years ago

image New error

roryk commented 8 years ago

Hi Alex,

Sorry you'll have to provide the path to the bcbio-installed python. It looks like it should be in /usr/local/share/bcbio/anaconda/bin/python

aaiezza commented 8 years ago

I see. Unfortunately I'm not incredibly familiar with python. I do have that anaconda directory but how do I add it? Thank you by the way for the help!

roryk commented 8 years ago

Hi Alex,

No worries-- if you run this:

/usr/local/share/bcbio/anaconda/bin/python test_gffutils.py /usr/local/share/bcbio/genomes/Hsapiens/GRCh37/rnaseq/ref-transcripts.gtf

it should work.

aaiezza commented 8 years ago

Ooh I see. I didn't expect that. Running that now yes I get the same error: image

roryk commented 8 years ago

How is /usr/local being mounted? If you copy ref-transcripts.gtf and ref-transcripts.gtf.db to your home directory and rerun the test_gffutils.py script pointing at them, does it work? I am thinking that something weird is going on with the mount that is confusing sqlite. If you connect with sqlite3 and actually try to perform a query, say with:

select * from features limit 1;

does it give you the locked error?
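
The same check can be scripted. SQLite opens files lazily, so simply connecting to a database does not touch its locks; the first real query does, which may be why the file looked accessible earlier. Below is a small sketch of the probe using the query above; `probe_features` is a hypothetical name, and the only thing assumed about the database is the `features` table that the query already refers to.

```python
import sqlite3


def probe_features(db_path):
    """Run a one-row query against the features table.

    Returns (True, row) on success, or (False, error_message) if SQLite
    raises an OperationalError such as "database is locked".
    """
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute("SELECT * FROM features LIMIT 1").fetchone()
        return True, row
    except sqlite3.OperationalError as exc:
        return False, str(exc)
    finally:
        conn.close()
```

Running it once against the copy on the NFS mount and once against a copy in the home directory gives a direct comparison of the two locations.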

aaiezza commented 8 years ago

Here's how it is mapped: image

This is because I don't have enough room on the local drive: image

But you are right on that attempt there: image

When I move it to a local directory I am able to access it. This is confusing though and annoying. Does this mean I can't use any NFS shares because sqlite is finicky?
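
As a stopgap while the mount is being sorted out, the copy-to-local step can be scripted so that only the small database file moves and the bulk data stays on the share. This is a sketch under the assumption that the system temp directory sits on a local disk; `open_local_copy` is a hypothetical helper and not a bcbio feature.

```python
import os
import shutil
import sqlite3
import tempfile


def open_local_copy(db_path, scratch_dir=None):
    """Copy db_path into a scratch directory and open the copy.

    Returns (connection, local_path). scratch_dir defaults to a fresh
    temporary directory, assumed here to be on local storage.
    """
    scratch_dir = scratch_dir or tempfile.mkdtemp(prefix="gtfdb-")
    local_path = os.path.join(scratch_dir, os.path.basename(db_path))
    shutil.copy2(db_path, local_path)
    return sqlite3.connect(local_path), local_path
```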

roryk commented 8 years ago

Now we're at something I'm not too good at. We use NFS shares all the time so it should be okay. Here is an example setup of one that works, if any of the options help:

hsphfs4:/srv/export/hsphfs4/share_root on /net/hsphfs4/srv/export/hsphfs4/share_root type nfs (rw,nosuid,nodev,vers=3,intr,sloppy)

aaiezza commented 8 years ago

I see. I'll mess around with some of those options in the mount and see if something changes.

Thanks so much for your help.

lpantano commented 8 years ago

Hi,

what @roryk discovered reminded @xaxis3 that the nolock option should do the job. Something like this in the fstab configuration:

babbage:/home/Shared    /media/shared   nfs     nolock,...

aaiezza commented 8 years ago

YES! image

You guys are champs! Shame on me for almost regretting my masters in Bioinformatics. It's running great. Thank you all! Truly great debugging. Such a silly thing too...

roryk commented 8 years ago

Awesome-- what was the change that worked?

aaiezza commented 8 years ago

@roryk Literally adding the nolock option to the fstab.
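
For anyone checking their own setup, here is a small sketch that scans fstab-style text for NFS entries that do not list `nolock`. It assumes the usual whitespace-separated fstab layout (device, mount point, type, options) and only looks at entries of type `nfs`; the function name is hypothetical. Note that with `nolock`, file locks are only enforced between processes on the same client, which should be fine for a database that is only read.

```python
def nfs_entries_missing_nolock(fstab_text):
    """Return mount points of nfs entries whose options lack 'nolock'."""
    missing = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete fstab entry
        mount_point, fstype, options = fields[1], fields[2], fields[3]
        if fstype == "nfs" and "nolock" not in options.split(","):
            missing.append(mount_point)
    return missing
```

It could be run on the contents of /etc/fstab, for example `nfs_entries_missing_nolock(open("/etc/fstab").read())`.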

roryk commented 8 years ago

@lpantano saves the day. Also, hello fellow RIT alum. :)

lpantano commented 8 years ago

Nice!

I would say @roryk and @xaxis3 (Judit :) no one better than a sysadmin to ask) saved it.

roryk commented 8 years ago

Beer is on me Judit.

aaiezza commented 8 years ago

Good times! I'm in. @roryk when did you graduate?

roryk commented 8 years ago

I updated the GRCh37 RNA-seq tar file, so now the BED file is included along with everything else and doesn't have to be generated from the gffutils database.