genome / gms

The Genome Modeling System installer
https://github.com/genome/gms/wiki
GNU Lesser General Public License v3.0

issues with re-importing BAM #198

Closed: shu2010 closed this issue 8 years ago

shu2010 commented 8 years ago

Dear authors, I have attached the error log from a failure during sample import. The import succeeded up to the creation of the instrument data ID, then failed while trying to create a disk allocation for it on the /opt volume. However, more than 2.5 TB of disk space is available on /opt.

Instrument data id: 2b30f95b156a49e1bebbf4f7913c7218
ERROR: Could not create allocation in specified disk group (info_alignments), which contains 1 volumes:
/opt/gms/3NCX8198/fs/3NCX8198
Could not create allocation in specified disk group (info_alignments), which contains 1 volumes:
/opt/gms/3NCX8198/fs/3NCX8198 at /opt/gms/3NCX8198/sw/genome/lib/perl/Genome/Disk/Detail/Allocation/Creator.pm line 214
    Genome::Disk::Detail::Allocation::Creator::_get_allocation_without_lock_impl('Genome::Disk::Detail::Allocation::Creator=HASH(0x5110e10)', 'ARRAY(0x50182d0)') called at /opt/gms/3NCX8198/§
    Genome::Disk::Detail::Allocation::Creator::__ANON__() called at /opt/gms/3NCX8198/sw/genome/lib/perl/Genome/Utility/Instrumentation.pm line 77
    eval {...} called at /opt/gms/3NCX8198/sw/genome/lib/perl/Genome/Utility/Instrumentation.pm line 76
    Genome::Utility::Instrumentation::timer('disk.allocation.create.get_allocation_without_lock') called at /opt/gms/3NCX8198/sw/genome/lib/perl/Genome/Disk/Detail/Al§
    Genome::Disk::Detail::Allocation::Creator::_get_allocation_without_lock('Genome::Disk::Detail::Allocation::Creator=HASH(0x5110e10)', 'ARRAY(0x50182d0)') c§
    Genome::Disk::Detail::Allocation::Creator::create_allocation('Genome::Disk::Detail::Allocation::Creator=HASH(0x5110e10)') called at /opt/gms/3NCX8§
    Genome::Disk::Allocation::_create('Genome::Disk::Allocation', 'allocation_path', 'instrument_data/imported/2b30f95b156a49e1bebbf4f7913c721§
2016/01/04 21:31:32 Genome::Command: Run wf failed! at /opt/gms/3NCX8198/sw/genome/lib/perl/Genome/InstrumentData/Command/Import/Basic.pm §
ERROR: Run wf failed! at /opt/gms/3NCX8198/sw/genome/lib/perl/Genome/InstrumentData/Command/Import/Basic.pm line 108.

sakoht commented 8 years ago

@shu2010 did you perhaps run out of disk space?

GrubLord commented 8 years ago

I can confirm we did not run out of disk space: there is plenty of space left on /opt (over 2.5 TB), and /tmp has 759 GB free. The file in question is only about 80 GB, and we have double-checked access permissions, user account control, and everything else we believe to be relevant. We can touch a file in the target location as the same user with no trouble, and df -h reports more than enough remaining space.

The relevant code seems only to check against available disk space, and we don't have any special multi-drive setup: it is all one partition, mounted at /opt, so the disk space check should pass. It is unclear to us why it is failing.
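For anyone reproducing this, the manual checks described above (free space as reported by df -h, and writing a file as the same user) can be scripted. The following is a minimal sketch in Python and is not part of GMS: `check_volume` is a hypothetical helper, the volume path is copied from the error log, and the 80 GB figure is the approximate BAM size mentioned above. It only inspects the filesystem; it says nothing about whatever volume bookkeeping Genome::Disk::Allocation does internally.

```python
import os
import shutil
import tempfile


def check_volume(path, required_bytes):
    """Report free space and writability for the filesystem holding `path`.

    Hypothetical helper for illustration; not part of the GMS code base.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    try:
        # Creating and removing a temporary file mirrors the manual `touch` test.
        with tempfile.NamedTemporaryFile(dir=path):
            writable = True
    except OSError:
        writable = False
    return {
        "free_bytes": usage.free,
        "enough_space": usage.free >= required_bytes,
        "writable": writable,
    }


if __name__ == "__main__":
    # Path taken from the error log; adjust for your own installation.
    volume = "/opt/gms/3NCX8198/fs/3NCX8198"
    if os.path.isdir(volume):
        print(check_volume(volume, 80 * 1000**3))
```

If both `enough_space` and `writable` come back true while the import still fails, that would suggest the failure comes from something other than the raw filesystem state, which is consistent with what is reported in this comment.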

We'd appreciate any assistance you could offer. We can even give you direct access to the VM we are using if you are willing to help us debug the issue. Perhaps a tweak to the Perl code is needed, or some formatting issue or config file prevents us from using more than 1.6 TB (the current full space) of our drive.

sakoht commented 8 years ago

What is odd is that this really runs all the time without issue, so there has to be something with the environment or the input files.

Have you previously gotten past this step on other data?

gatoravi commented 8 years ago

Also, could you post the exact command that you're running?

shu2010 commented 8 years ago

@sakoht Yes, the entire workflow succeeded with the HC1143 external dataset from SRA. The obvious difference is the size of the datasets: I am currently using BAM files of about 80 GB, whereas the largest file in the HC1143 dataset was about 10 GB.

@gatoravi Here is the command:

genome instrument-data import basic \
    --description='lm-normal-wgs-gms' \
    --import-source-name='XXXXX' \
    --instrument-data-properties='clusters=482983371' \
    --source-files="$INSTRUMENT_DATA_DIRECTORY/name_sorted.bam" \
    --library="$LIBRARY_NORMAL_1"

As I mentioned at the beginning of this thread, the command runs successfully up to the creation of the instrument data ID and fails shortly afterwards.