bioperl / bioperl-live-redmine

Legacy tickets migrated from the OBF Redmine issue tracker: http://redmine.open-bio.org
0 stars 0 forks source link

Making Bio::DB::Fasta Storable and thread safe #139

Open cjfields opened 8 years ago

cjfields commented 8 years ago

Author Name: Florent Angly (@fangly) Original Redmine Issue: 3397, https://redmine.open-bio.org/issues/3397 Original Date: 2012-11-28 Original Assignee: Florent Angly


Issue initially raised by Carson Holt in bug #3390: " The thread safe idea has to do with returning values from the thread or with shared between threads variables. The most common way to handle these types of values in Perl is to use the Storable module to serialize objects and move data around. Interfaces like DB_File don’t serialize correctly, so STORABLE_freeze and STORABLE_thaw are hooks that Storable looks for to tell it how to serialize and deserialize these complex objects. Basically you just need to untie the offsets hash on serialization in the STORABLE_freeze hook and then tie it again in the STORABLE_thaw hook.

Also I’m not sure if you’ve done this, but it was something I noticed about the previous Bio::DB::Fasta. The tied hash does not flush to the file until it is untied at least once after creation, so if I create a new DB and then try and connect from another process before the original one has untied at least once then the DB is partial or empty. I think it would make sense that just when creating a new index, that the object untie the offsets hash once and then retie it to force the flush. You may have already done this. "

cjfields commented 8 years ago

Original Redmine Comment Author Name: Florent Angly Original Date: 2012-12-03T02:11:39Z


See this branch: https://github.com/bioperl/bioperl-live/tree/storable\_db Support for serialization and synchronicity added.

cjfields commented 8 years ago

Original Redmine Comment Author Name: Chris Fields Original Date: 2013-03-03T05:40:49Z


Florent, is this fixed?

cjfields commented 8 years ago

Original Redmine Comment Author Name: Florent Angly Original Date: 2013-03-04T07:45:29Z


Hi Chris,

While I have made some improvements, this is not entirely fixed. The main roadblock is a bug I have reported upstream: https://rt.perl.org/rt3//Public/Bug/Display.html?id=115972

I can try to cleanup the branch and merge the existing improvements, though.

cjfields commented 8 years ago

Original Redmine Comment Author Name: Florent Angly Original Date: 2013-03-06T01:19:29Z


Here are the changes on branch storable_db and recorded in the Changes file: 1) hooks for compatiblity with Storable, 2) support for multithreading, 3) Bio::DB::Fasta and Qual have better support for SDBM backend, but do not support the (broken) NDBM backend anymore.

This means that Bio::DB::Fasta (and Qual) are now Storable-friendly and support multithreading through a lock and synchronization mechanism.

Regarding concurrency, the locking is provided by File::SharedNFSLock, which I added as a recommended dependency in Build.PL (can’t remember if that’s the desired way). Note that this module works for non-NFS systems as well. However, as described in the limitations of Bio::DB::IndexedBase, the locking will not work on FAT32 filesystems because there is no mechanism on FAT32 to ensure atomicity. FAT32 is mostly relegated to USB drives nowadays. I have noticed that some other Bioperl modules like Bio/DB/SeqFeature/Store/berkeleydb use a locking mechanism that cannot ensure atomicity and might benefit being using File::SharedNFSLock in the future.

Regarding multithreading, there is a scenario that does not work, that is when one tries to return a Bio::DB::Fasta sequence object from a thread. This is a bug reported upstream: https://rt.perl.org/rt3//Public/Bug/Display.html?id=115972 and is documented in the test file ./t/LocalDB/Fasta.t as a TODO.

So, in summary, I would say that this branch is ready for being merged in master (I have resolved the conflicts already) and deleted. However, we might want to leave the bug report open and update it when upstream is fixed.

Florent