Closed aaiezza closed 8 years ago
Alex; Sorry about the issue. This normally occurs when another process is writing to a shared sqlite3 database, or when a previous process died and left a lock on the file. I'm not sure what genome you're working from, but if you look at the equivalent of:
/path/to/bcbio/genomes/Hsapiens/GRCh37/rnaseq/ref-transcripts.gtf.db
Are there lock files or active processes writing to this (say, from a simultaneous install)? If not, you could try removing the *.db file then re-running the installer (not as root, from the other thread) to re-build it.
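One way to look for leftovers from an interrupted write is sketched below. SQLite takes advisory file locks rather than creating lock files, but it does write sidecar files next to the database (a `-journal` rollback journal, or `-wal` and `-shm` in WAL mode), and one left behind can point to a writer that died. The helper name `find_sidecar_files` is made up for this illustration and is not part of bcbio.

```python
import os

# Suffixes SQLite appends to the database filename for its sidecar files:
# rollback journal, write-ahead log, and WAL shared-memory index.
SIDECAR_SUFFIXES = ("-journal", "-wal", "-shm")


def find_sidecar_files(db_path):
    """Return the SQLite sidecar files that currently exist next to db_path."""
    return [db_path + suffix
            for suffix in SIDECAR_SUFFIXES
            if os.path.exists(db_path + suffix)]
```

Calling it on the `ref-transcripts.gtf.db` path and getting an empty list back suggests no interrupted write left a journal behind. It does not rule out a lock held by a live process.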
Hope this helps fix things for you.
Should I not run the installer as root?
Alex;
No, I'd avoid this. It's not a great idea to run anyone's third party script as root. In bcbio's case, this would also require using root/sudo for every update since the installed files will be owned by root. If you want to avoid any manual chmod steps, run the installer as your regular user with the --sudo flag, which will chmod directories to avoid needing to use root in the future. Hope this helps.
I've run the installer as root and as a regular user now, and I've wiped the rna-transcripts.gtf.db file and rerun the installer several times. The install seems successful every time, but when I go to run bcbio I keep getting that same error message. I can access this file manually using sqlite3, but bcbio_nextgen.py gives this error message.
I'd really like to get this to work. Is there anything else you can recommend to get this to work? Maybe I have a configuration off or something... I've followed the automated installer instructions multiple times now and satisfied every error I've encountered but still nothing.
Thanks, Alex
Alex; Sorry about the continued problems. Is there anything special about the filesystem that might be causing problems? I know sqlite can have issues on some shared or samba mounted filesystems, so that might explain the issue.
I also pushed a fix to try and work around this. Since the BED file should exist from the data installation process, I avoid trying to get the GTF database unless it's missing. Hopefully that will skip over this step and let you get things running. If you upgrade with:
bcbio_nextgen.py upgrade -u development
it will grab this fix and hopefully get you going. Hope this helps.
Brad,
Good call on the filesystem. I happen to know that any remote sqlite databases will only allow read privileges. If there is writing going on that might explain the issue. I am using an NFS share to house all my data for the run. I suppose that might be recognized as remote from sqlite's stance.
I updated to the latest development build of bcbio and still no luck sadly. I will make attempts to put all data on a local disk but honestly this kind of destroys the paradigm of the product in my eyes. Sqlite's shortcomings here are reflecting in the pipeline unfortunately.
Cheers, Alex
Hi @aaiezza,
That database is never written to, only read from. Can you access the database with sqlite3 if you do it from the node the job is running on?
Could you post up an ls -l of the files in the GRCh37/rnaseq directory?
@roryk I gave that a second thought after posting, actually. Also, I haven't parallelized anything yet; I'm just running on a single machine with 16 cores and 30 GB of memory. I can access the database file using sqlite3 as is.
I've found forum posts, however, that describe multiple processes attempting to connect to the same db file, and the go-to solution seems to be increasing the connection timeout. I don't see how that is applicable here, nor would I know how to add that.
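For reference, here is roughly what the connection timeout looks like with Python's standard `sqlite3` module. This is only a sketch: bcbio opens the database through gffutils, and where a timeout could be passed there is not shown in this thread. The helper names are made up for illustration. A longer timeout only helps when a live process briefly holds the lock; it likely would not help if the filesystem itself is refusing lock requests.

```python
import sqlite3


def open_with_timeout(db_path, timeout_seconds=30.0):
    """Open db_path, waiting up to timeout_seconds for a lock to clear."""
    return sqlite3.connect(db_path, timeout=timeout_seconds)


def is_locked(db_path, timeout_seconds=0.1):
    """Return True if a trivial read fails with 'database is locked'."""
    conn = sqlite3.connect(db_path, timeout=timeout_seconds)
    try:
        # Reading the schema table requires a shared lock on the file.
        conn.execute("SELECT name FROM sqlite_master LIMIT 1").fetchall()
        return False
    except sqlite3.OperationalError as exc:
        if "locked" in str(exc):
            return True
        raise  # some other problem, e.g. a file that is not a database
    finally:
        conn.close()
```

`is_locked` uses a deliberately short timeout so that it reports quickly; `open_with_timeout` is the variant to use when waiting for a competing writer is acceptable.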
If you download https://gist.githubusercontent.com/roryk/0868086be6490e303d99/raw/8440547af492a42cdf7d92bf69cc07dd1302b627/test_gffutils.py and run it with:
/path/to/the/installed/bcbio/python test_gffutils.py /usr/local/share/bcbio/genomes/Hsapiens/GRCh37/rnaseq/ref-transcripts.gtf
Do you encounter the same error?
New error
Hi Alex,
Sorry, you'll have to provide the path to the bcbio-installed python. It looks like it should be at /usr/local/share/bcbio/anaconda/bin/python
I see. Unfortunately I'm not incredibly familiar with python. I do have that anaconda directory but how do I add it? Thank you by the way for the help!
Hi Alex,
No worries-- if you run this:
/usr/local/share/bcbio/anaconda/bin/python test_gffutils.py /usr/local/share/bcbio/genomes/Hsapiens/GRCh37/rnaseq/ref-transcripts.gtf
it should work.
Ooh I see. I didn't expect that. Running that now yes I get the same error:
How is /usr/local being mounted? If you copy like ref-transcripts.gtf and ref-transcripts.db to your home directory and rerun the test_gffutils.py script pointing at them, does it work? I am thinking that something weird is going on with the mount somehow that is confusing sqlite. If you connect with sqlite3 and actually try to perform a query, say with:
select * from features limit 1;
does it give you the locked error?
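The same check can be scripted so that it is easy to run against both the NFS copy and a local copy. The sketch below assumes the database has a `features` table, as in the query above. It opens the file read-only through SQLite's URI syntax (`mode=ro`), and `probe_features` is a name invented for this example.

```python
import sqlite3
from pathlib import Path


def probe_features(db_path):
    """Run the one-row query from the thread; return (ok, detail)."""
    # mode=ro asks SQLite to open the file read-only.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    try:
        conn = sqlite3.connect(uri, uri=True, timeout=1.0)
    except sqlite3.OperationalError as exc:
        return False, str(exc)
    try:
        conn.execute("SELECT * FROM features LIMIT 1").fetchall()
        return True, "query succeeded"
    except sqlite3.OperationalError as exc:
        # "database is locked" would show up here as the detail string.
        return False, str(exc)
    finally:
        conn.close()
```

If the probe fails on the mounted path but succeeds on a copy in your home directory, that points at the mount rather than at the database file.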
Here's how it is mapped:
This is because I don't have enough room on the local drive:
But you are right on that attempt there:
When I move it to a local directory I am able to access it. This is confusing though and annoying. Does this mean I can't use any NFS shares because sqlite is finicky?
Now we're at something I'm not too good at. We use NFS shares all the time so it should be okay. Here is an example setup of one that works, if any of the options help:
hsphfs4:/srv/export/hsphfs4/share_root on /net/hsphfs4/srv/export/hsphfs4/share_root type nfs (rw,nosuid,nodev,vers=3,intr,sloppy)
I see. I'll mess around with some of those options in the mount and see if something changes.
Thanks so much for your help.
Hi,
What @roryk discovered reminded @xaxis3 that the nolock option should do the job. Something like this in the fstab configuration:
babbage:/home/Shared /media/shared nfs nolock,...
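To see which NFS entries are missing the option, a small parser over fstab-style lines can help. This is a sketch that assumes the usual whitespace-separated layout (device, mount point, type, options); the function name is invented for the example. Note that `nolock` turns off NFS file locking between clients, so it is best suited to files that other machines do not write to.

```python
def nfs_mounts_without_nolock(mounts_text):
    """Return mount points of NFS entries that lack the nolock option.

    mounts_text is expected in fstab layout:
    device  mountpoint  fstype  options  [dump]  [pass]
    """
    missing = []
    for line in mounts_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        fields = stripped.split()
        # Matches "nfs" and "nfs4"; anything else is not an NFS mount.
        if len(fields) < 4 or not fields[2].startswith("nfs"):
            continue
        if "nolock" not in fields[3].split(","):
            missing.append(fields[1])
    return missing
```

For example, `nfs_mounts_without_nolock(open("/etc/fstab").read())` lists the NFS mount points that would still use NFS locking.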
YES!
You guys are champs! Shame on me for almost regretting my masters in Bioinformatics. It's running great. Thank you all! Truly great debugging. Such a silly thing too...
Awesome-- what was the change that worked?
@roryk Literally adding the nolock option to the fstab.
@lpantano saves the day. Also, hello fellow RIT alum. :)
Nice!
I would say @roryk and @xaxis3 (Judit :) no better than a sys admin to ask) saved it.
Beer is on me Judit.
Good times! I'm in. @roryk when did you graduate?
I updated the GRCh37 RNA-seq tar file, so now the BED file is included along with everything else and doesn't have to be generated from the gffutils database.
After kicking off the following commands:
I get:
Any idea what this might be?
-Alex