Directory based approach for index encryption [LUCENE-9379]

asfimport commented 4 years ago

Important: This Lucene Directory wrapper approach is to be considered only if an OS level encryption is not possible. OS level encryption better fits Lucene usage of OS cache, and thus is more performant. But there are some use-case where OS level encryption is not possible. This Jira issue was created to address those.

The goal is to provide optional encryption of the index, with a scope limited to an encryptable Lucene Directory wrapper.

Encryption is at rest on disk, not in memory.

This simple approach should fit any Codec as it would be orthogonal, without modifying APIs as much as possible.

Use a standard encryption method. Limit perf/memory impact as much as possible.

Determine how callers provide encryption keys. They must not be stored on disk.

Migrated from LUCENE-9379 by Bruno Roustant (@bruno-roustant), 3 votes, updated Jun 03 2021 Linked issues:

3304
- 8023

Pull requests: https://github.com/apache/lucene-solr/pull/1608

asfimport commented 4 years ago

Bruno Roustant (@bruno-roustant) (migrated from JIRA)

So I plan to implement an EncryptingDirectory extending FilterDirectory.

Encryption method:

AES CTR (counter)

This mode is approved by NIST. (https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#Counter_.28CTR.29)
AES encryption has the same size as the original clear text (no padding). So we can use the same file pointers.
CTR mode allows random access to encrypted blocks (128 bits blocks).
IV (initialisation vector) must be random, and is stored at the beginning of the encrypted file because it can be public. No need to repeat the IV for each block (less disk impact compared to CBC mode).
It is appropriate to encrypt streams.

API:

I don’t anticipate any API change.

How to provide encryption keys:

EncryptingDirectory would require a delegate Directory, an encryption key supplier, and a Cipher pool (for performance).

For the callers to pass the encryption keys, I see two ways:

1- In Solr, declare a DirectoryFactory in solrconfig.xml that creates EncryptingDirectory. This factory is able to determine the encryption key per file based on the path. It is the responsibility of this factory to access the keys (e.g. stored in safe DB, received with an admin handler, read from properties, etc). The Cipher pool is hold by the DirectoryFactory.

2- More generally the EncryptingDirectory can be created to wrap a Directory when opening a segment (e.g. in PostingsFormat/DocValuesFormat fieldsConsumer()/fieldsProducer(), in StoredFieldFormat fieldsReader()/fieldsWriter(), etc). In this case the PostingsFormat/DocValuesFormat/StoredFieldFormat extension determines the encryption key based on the SegmentInfo. A custom Codec can be created to handle encrypting formats. The Cipher pool is hold either in the Codec or in the Format.

Code:

I will inspire from Apache commons-crypto CtrCryptoOutputStream, although not directly using it because it is an OutputStream while we need an IndexOutput. And we can probably simplify since we have a specific use-case compared to this lib wide usage.

asfimport commented 4 years ago

Bruno Roustant (@bruno-roustant) (migrated from JIRA)

First PR, functional but incomplete. The idea of using a pool of Cipher does not work in Lucene.

To run the tests, two options:

test -Dtests.codec=Encrypting It executes the tests with the EncryptingCodec in test-framework. Currently it encrypts a delegate PostingsFormat. This option shows how to provide the encryption key depending on the SegmentInfo.

test -Dtests.directory=org.apache.lucene.codecs.encrypting.SimpleEncryptingDirectory It executes the tests with the SimpleEncryptingDirectory in test-framework. This option is the simplest; it shows how to provide the encryption key as a constant (could be a property) or only depending on the name of the file to encrypt (no SegmentInfo).

There is a performance issue because of too many new Ciphers when slicing IndexInput. javax.crypto.Cipher is heavy weight to create and is stateful. I tried a CipherPool, but actually there are many cases where we need to get lots of slices of the IndexInput so we have to create lots of new stateful Cipher. The pool turns out to be a no-go, there are too many Cipher in it.

TODO:

find a lighter alternative to Cipher if it exists.
fix a couple of tests still failing because of unclosed IndexOutput.

asfimport commented 4 years ago

Bruno Roustant (@bruno-roustant) (migrated from JIRA)

I updated the PR. Now it is functional and complete, with javadoc.

There should be no perf issue anymore because I replaced javax.crypto.Cipher by a much lighter code that is strictly equivalent, encryption/decryption is the same (tested randomly by 3 different tests).

For reviewers, there are 33 changed files in the PR but only 10 source classes, the other are for tests. Look for the classes in store package (e.g. EncryptingDirectory, EncryptingIndexOutput, EncryptingIndexInput) and the new util.crypto package (e.g. AesCtrEncrypter).

Now all tests pass when enabling the encryption with a test codec or a test directory.

Next step:

Run luceneutil benchmark to evaluate the perf impact.

asfimport commented 4 years ago

Bruno Roustant (@bruno-roustant) (migrated from JIRA)

Watchers, I need your help.

I need to know how you would use the encryption, and more precisely how you would provide the keys. Is my approach of using either an EncryptingDirectory (in the PR look at SimpleEncryptingDirectory) or a custom Codec (in the PR look at EncryptingCodec) appropriate for your use-case?

Note that both SimpleEncryptingDirectory and EncryptingCodec are only in test packages as I expect the users to write some custom code to use encryption. If you have an idea of a standard code that could be added to make encryption easy, please share your idea here.

asfimport commented 4 years ago

Bruno Roustant (@bruno-roustant) (migrated from JIRA)

@rmuir makes an important callout in the PR. A better approach is by leveraging the OS encryption at filesystem level because it fits the OS filesystem cache. That way the cached pages are decrypted in the cache.

So whenever it is possible, we must use OS level encryption. An OS filesystem encryption allows to encrypt differently per directory/file, and some allow to manage multiple keys.

But OS level encryption is not always possible. The example I can think of is running on computing engines on public cloud. In this case we don't have access to the OS level encryption (there is one but we cannot manage keys).

So this Jira issue propose a solution in the case we cannot use OS level encryption and we need to manage multiple keys. It should be stated well in the doc/javadoc. It is sub-optimal because it has to decrypt each time it accesses a cached IO page. So expect more performance impact.

asfimport commented 4 years ago

Bruno Roustant (@bruno-roustant) (migrated from JIRA)

I ran the benchmarks to measure the perf impact of this IndexInput-level encryption on the PostingsFormat (luceneutil on wikimediumall).

When encrypting only the terms file, FST file and metadata file (.tim .tip .tmd) (not doc id nor postings): Most queries run between -0% to -35% Wildcard -47% Fuzzy/Respell between -60% to -74%

It is possible to encrypt all files, but the perf drops considerably, -60% for most queries, -90% for fuzzy queries.

asfimport commented 4 years ago

Bruno Roustant (@bruno-roustant) (migrated from JIRA)

TaskQPS Lucene86 StdDevQPS EncryptionTim StdDev Pct diff Respell 41.55 (2.7%) 10.76 (0.9%) -74.1% ( -75% - -72%) Fuzzy2 44.81 (9.0%) 12.00 (1.1%) -73.2% ( -76% - -69%) Fuzzy1 41.03 (7.3%) 16.24 (1.9%) -60.4% ( -64% - -55%) Wildcard 28.02 (4.0%) 14.94 (2.0%) -46.7% ( -50% - -42%) OrHighNotLow 747.43 (4.2%) 485.90 (3.5%) -35.0% ( -40% - -28%) OrNotHighMed 524.60 (4.2%) 344.06 (2.9%) -34.4% ( -39% - -28%) OrHighNotHigh 576.32 (5.0%) 382.60 (4.0%) -33.6% ( -40% - -25%) OrHighNotMed 553.85 (4.1%) 371.73 (3.4%) -32.9% ( -38% - -26%) MedTerm 1116.53 (3.6%) 766.39 (2.6%) -31.4% ( -36% - -26%) LowTerm 1376.31 (4.2%) 947.48 (3.0%) -31.2% ( -36% - -25%) OrNotHighLow 492.68 (4.7%) 342.05 (4.7%) -30.6% ( -38% - -22%) AndHighLow 482.97 (3.8%) 342.18 (3.4%) -29.2% ( -34% - -22%) OrHighLow 410.23 (3.7%) 294.38 (3.8%) -28.2% ( -34% - -21%) HighTerm 971.63 (5.3%) 701.77 (3.2%) -27.8% ( -34% - -20%) OrNotHighHigh 493.99 (5.1%) 358.95 (3.9%) -27.3% ( -34% - -19%) LowPhrase 286.03 (2.9%) 246.04 (2.8%) -14.0% ( -19% - -8%) HighPhrase 290.25 (3.3%) 252.54 (3.4%) -13.0% ( -18% - -6%) Prefix3 51.36 (4.8%) 45.20 (4.1%) -12.0% ( -19% - -3%) AndHighMed 113.34 (4.0%) 105.77 (4.0%) -6.7% ( -14% - 1%) MedSloppyPhrase 79.83 (3.5%) 74.78 (3.6%) -6.3% ( -13% - 0%) HighTermDayOfYearSort 63.32 (13.3%) 59.34 (14.6%) -6.3% ( -30% - 24%) HighTermTitleBDVSort 86.16 (10.3%) 81.63 (10.0%) -5.3% ( -23% - 16%) LowSpanNear 58.07 (3.1%) 55.13 (3.2%) -5.1% ( -10% - 1%) AndHighHigh 44.58 (4.1%) 42.92 (4.2%) -3.7% ( -11% - 4%) OrHighMed 56.53 (4.4%) 54.65 (4.1%) -3.3% ( -11% - 5%) BrowseDateTaxoFacets 1.54 (4.6%) 1.50 (5.2%) -2.5% ( -11% - 7%) HighTermMonthSort 18.51 (10.5%) 18.06 (10.1%) -2.4% ( -20% - 20%) BrowseDayOfYearTaxoFacets 1.53 (4.7%) 1.49 (5.3%) -2.3% ( -11% - 8%) BrowseMonthTaxoFacets 1.77 (3.5%) 1.74 (4.2%) -2.1% ( -9% - 5%) HighSpanNear 12.75 (3.6%) 12.50 (4.1%) -2.0% ( -9% - 5%) MedPhrase 107.89 (3.2%) 106.01 (3.9%) -1.7% ( -8% - 5%) HighSloppyPhrase 12.86 (4.0%) 12.71 (4.7%) -1.2% ( -9% - 7%) MedSpanNear 11.76 (3.1%) 11.62 (3.4%) -1.1% ( -7% - 5%) HighIntervalsOrdered 13.61 (3.2%) 13.46 (3.3%) -1.1% ( -7% - 5%) OrHighHigh 11.12 (3.7%) 11.12 (4.1%) -0.1% ( -7% - 8%) BrowseMonthSSDVFacets 4.28 (3.9%) 4.29 (3.9%) 0.2% ( -7% - 8%) BrowseDayOfYearSSDVFacets 3.82 (3.7%) 3.84 (3.4%) 0.3% ( -6% - 7%) IntNRQ 25.54 (3.1%) 26.34 (3.4%) 3.1% ( -3% - 9%) PKLookup 174.98 (3.0%) 183.78 (4.5%) 5.0% ( -2% - 12%) LowSloppyPhrase 6.29 (3.5%) 6.89 (4.5%) 9.6% ( 1% - 18%)

asfimport commented 4 years ago

Bruno Roustant (@bruno-roustant) (migrated from JIRA)

I tested with FST ON-HEAP: we gain +15% to +20% perf on all queries.

I tested my Light version of javax.crypto.Cipher. It is indeed much faster for construction and cloning, but not for the core encryption. The reason is that two internal classes in com.sun.crypto have an @HotSpotIntrinsicCandidate annotation that makes the encryption extremely fast.

I tested with a hack version that takes the best of the two versions. It brings a cumulative +10% perf improvement.

So as a conclusion for the perf benchmark:

An OS level encryption is best and fastest.
If really it’s not possible, expect an average of -20% perf impact on most queries, -60% on multiterm queries.
If you need more you can make FST on-heap and expect +15% perf.
If you need more you can use a Cipher hack to get +10% perf.

asfimport commented 4 years ago

David Smiley (@dsmiley) (migrated from JIRA)

I'm glad you remembered on-heap FST.

Another option to improve performance more is a Java heap level cache. It could be added later and layered above this Directory (without being intertwined with this issue/code) if deemed worthwhile.

asfimport commented 4 years ago

Uwe Schindler (@uschindler) (migrated from JIRA)

How about the Solr Block Cache used for HDFS? It could be moved to Lucene (as HDFS is going away anyways).

asfimport commented 3 years ago

Bruno Roustant (@bruno-roustant) (migrated from JIRA)

I'm going to pause my work on this for some time, until there are comments added here that share use-cases where OS level encryption is not possible. If you can use OS level encryption, do so, it will be faster. If not, share your use-case here.

asfimport commented 3 years ago

Rajeswari Natarajan (migrated from JIRA)

We have a use case where we want to fit multiple index/tenant per collection and each index/tenant should have a separate key and we would like to use composite ID router. The use of composite id router do not limit each index/tenant per shard/directory . In this scenario , is OS level encryption possible?

asfimport commented 3 years ago

David Smiley (@dsmiley) (migrated from JIRA)

Rajeswari – you are referring to some SolrCloud concepts. The scenario you describe would often co-locate your "tenants", and thus any OS or Lucene Directory or Codec levels simply won't work. For example if you had a field "name" that's indexed, then it's an index for all docs in that index, spanning your multiple "tenants". Instead, you could either create separate Collections, or have one Collection with "implicit" (really explicit) shard creation/naming for each tenant, but you'd have to be careful in all you do to query/index a specific shard instead of accidentally querying the whole.

asfimport commented 3 years ago

Bruno Roustant (@bruno-roustant) (migrated from JIRA)

Rajeswari Natarajan maybe a better approach would be to have one tenant per collection, but you might have many tenants so the performance for many collection is poor? If this is the case, then I think the root problem is the perf for many collections. Without composite id router you could use an OS encryption per collection.

asfimport commented 3 years ago

Rajeswari Natarajan (migrated from JIRA)

@bruno-roustant and @dsmiley , if we go with implicit router, shard management/rebalancing/routing becomes manual. Solrcloud will not take care of these (In solr mailing lists always I see users are advised against taking this route) , so looking to see if encryption possible with composite id router and multiple tenants per collection . We might have around 3000+ collections going forward , so having one collection per tenant will make our cluster really heavy. Please share your thoughts and if anyone has attempted this kind of encryption

asfimport commented 3 years ago

Ming Zhang (migrated from JIRA)

@bruno-roustant In our case, we have dedicated collection for each tenant. Because it has so many tenants that it's not possible to serve them in single solr clsuter, we have multiple clusters. It has to have different encryption key for each collection as well. It looks this directory(tenant) based approach is able address our requirement. Looking forward to getting this enhancement soon.

asfimport commented 3 years ago

Martin Huber (migrated from JIRA)

@bruno-roustant - one very valid use case that is not solvable by means of OS encryption is if you want to ensure per index encryption (for single users or groups of users) preserving zero knowledge privacy of documents and the search index. OS level encryption, as far as I know, always allows file access to admins with local access to the filesystem as long as the encrypted volume is mounted. This only can be overcome with in-memory en/decryption.

So +1 for what you did ! 👍

asfimport commented 3 years ago

Robert Muir (@rmuir) (migrated from JIRA)

Sorry, the above comment is really wrong. Please see my comments on linked issues.

You can definitely manage encryption at multiple levels in the os:

block level
filesystem level

Please understand the options available and be educated about this, see: https://www.kernel.org/doc/html/latest/filesystems/fscrypt.html This FS-level crypto subsystem is usable with e.g. ext4 and f2fs filesystems, among others. So you can definitely do different stuff per-directory, which makes multitenant use-cases easily possible (and from my understanding, was the intent of the changes in the first place)

I won't drop my -1 vote on this because folks won't read the documentation for their operating system.

asfimport commented 3 years ago

Robert Muir (@rmuir) (migrated from JIRA)

As always, you can count on arch to have some good user-level wiki docs on how to do this: https://wiki.archlinux.org/title/Fscrypt

asfimport commented 3 years ago

Martin Huber (migrated from JIRA)

@rmuir thanks for the useful links.

But I didn't say, that per directory or per user encryption would not be possible. This alone is not our usecase.

What I said is, that a user with root access to a system can read all files of all users while the users directories are mounted / unlocked. Or he can become the user and then see the files.

Is this statement not right ?

And such there is no privacy.

asfimport commented 3 years ago

Robert Muir (@rmuir) (migrated from JIRA)

Your argument is even more uneducated, the "i can do better than encryption at rest" argument. Get out of town!

Lucene depends on the OS page cache for performance. So if you want to encrypt stuff, you need to use the operating system. Also, encrypting storage is non-trivial, and this is a search engine project. Every time someone makes a patch for this issue, its never a standard mode like AES-XTS, it's always some insecure homemade garbage!

I'm standing by my decision. Creating more JIRA issues or making more arguments won't help the situation.

asfimport commented 3 years ago

David Smiley (@dsmiley) (migrated from JIRA)

Rob, please tone down your language. Don't speak of how much others are "uneducated"; merely point to what you want to show to help others understand your point of view.

asfimport commented 3 years ago

Bruno Roustant (@bruno-roustant) (migrated from JIRA)

RE AES-XTS vs AES-CTR: In the case of Lucene, we produce read-only files per index segment. And if we have a new random IV per file, we don't repeat the same (AES encrypted) blocks. So we are in a safe read-only-once case where AES-XTS and AES-CTR have the same strength [1][2]. Given that CTR is simpler, that's why I chose it for this patch.

[1] https://crypto.stackexchange.com/questions/64556/aes-xts-vs-aes-ctr-for-write-once-storage [2] https://crypto.stackexchange.com/questions/14628/why-do-we-use-xts-over-ctr-for-disk-encryption

asfimport commented 3 years ago

germafab (migrated from JIRA)

Thanks [\~broustant]/@bruno-roustant, this is also something that I was looking for!

As for @rmuir's comment(s): I think the important distinction to be made is the goal of the usage of encryption and the guarantees you need.

If one needs tenant based encryption at rest, os level encryption is a valid way to go. Also if one needs maximum performance and tries to squeeze every last drop of performance out of their NVMe's - os level encryption (or no encryption) would probably be best.

BUT: In todays world there are sometimes things that are more important (or pose a greater risk) to a project or a company: namely user privacy and data protection. In such cases decreased performance is certainly acceptable (if not already anticipated).

Many of the above arguments against this contribution can be addressed one way or another. What can NOT be addressed (and why @bruno-roustant's contribution is valuable) is:

It allows for the stored content to only be accessible to Lucene (the process/thread), for the exact duration that Lucene needs to process the data, without any dependency on a downstream component.
It allows for platform interoperability/independence. (Example: ) This allows the solution to be deployed to Linux system, while being developed on MacOS/Windows. (Sidenote: This is very important if there are large teams working on solution building on this.)
It can even offer protection from passive privileged users - meaning that the file on the filesystem is not readable for a privileged user. In contrast to that the os-level encryption that would make such protections more complex.
It allows for simple deployment in container technologies (which would be tricky with the alternatives proposed by @rmuir)

Maybe the increased interest in this topic signals that there is something to be done?

Also recent research has taken note - like: (From the abstract: ) "[...] However, currently deployed IR technologies, e.g., Apache Lucene - open-source search software, are insufficient when the information is protected or deemed to be private [...]" (Source: https://www.computer.org/csdl/journal/tq/5555/01/08954811/1gs4XOshKHC)

osnatShomrony commented 5 months ago

With the new requirement of PCI 4.0 that disk encryption cannot be the only protection for data at rest, this contribution becomes very crucial, is there any progress with this ? https://www.vikingcloud.com/blog/pci-dss-v4-are-you-using-disk-encryption

apache / lucene

Directory based approach for index encryption [LUCENE-9379] #10419

3304

8023