chriseldredge / Lucene.Net.Linq

LINQ provider to run native queries on a Lucene.Net index
Other
151 stars 66 forks source link

LuceneDataProvider keeps a lock on FSDirectory when not disposed #78

Closed ChristopherHaws closed 9 years ago

ChristopherHaws commented 9 years ago

It seems that the LuceneDataProvider when used with FSDirectory keeps a lock on the index for the duration of its lifetime. Looking into the code it looks like the IndexWriter is created in the constructor and passed along to the session when you OpenSession. From my perspective, the OpenSession method should create the IndexWriter instead of the LuceneDataProvider. This becomes an issue for me because we have a single instance class that has a LuceneDataProvider as a dependency and it is holding a lock on the index files even when we are not writing to it. Perhaps the constructor could open and close an IndexWriter to simply make sure that the index is created if it currently doesn't exist.

chriseldredge commented 9 years ago

The intent is that OpenSession<T>() will be thread safe and not block to acquire a write lock when committing or at any other time. If you have a scenario where another process needs to write to the index while Lucene.Net.Linq is running, my design may not be the best for your use case.

You may be able to design your program in a way that ReadOnlyLuceneDataProvider.cs is used most of the time for read-only operations, then when you need to write you could instantiate a normal provider, open a session, make your changes, dispose the session and finally dispose the provider. There would be a significant performance overhead for writes implemented in this manner, and you would also need any currently open ReadOnlyLuceneDataProvider instances to reload their indexes. This would not happen automatically.

ChristopherHaws commented 9 years ago

Ok, I think I understand. Currently we are using ReadOnlyLuceneDataProviders anytime we need to read from the index and we have a single service that is in charge of writing updates to the index. Since there is only one service writing to the index, it isnt that big of a deal that there is a lock on the index because it is the only thing writing to it. Also, since our ReadOnlyLuceneDataProviders are being used in a website and they are set to Instance per Request in our IoC container, they will get created on each request anyways.

chriseldredge commented 9 years ago

That design should work, but you may find better performance by only creating ReadOnlyLuceneDataProvider once and sharing it across threads. It is designed to be thread-safe. You could refresh the provider on a timer or have some message sent from the writing process to tell the web servers that new data is available.

ChristopherHaws commented 9 years ago

That sounds good to me. By any chance is there a method for telling the existing provider to refresh its IndexSearcher or do I need to create a new instance and swap them? I noticed that there is a RegisterCacheWarmingCallback and a WarmUpContext in the LuceneDataProvider class, but from what I can see it is only Reload() is only called as part of a Commit. Would this be something that could be exposed to the RO provider as an easy mechanism for telling the provider to update its caches?

chriseldredge commented 9 years ago

There should be a Reload() but there isn't one yet.

ChristopherHaws commented 9 years ago

Awesome. Look forward to seeing it. In the meantime I made a Provider that creates a new instance of the ReadOnlyLuceneDataProvider and warms it up then swaps the current instance. Thanks for the awesome library and the support btw! :)