nitrite / nitrite-java

NoSQL embedded document store for Java
https://bit.ly/no2db
Apache License 2.0
825 stars 95 forks source link

bug: Nitrite#getCollection sometimes fails #988

Closed DarkAtra closed 2 months ago

DarkAtra commented 2 months ago

When trying the use Nitrite#getCollection after Nitrite#getRepository it fails with:

org.dizitart.no2.exceptions.ValidationException: A repository with same name already exists

Reproducer: https://github.com/DarkAtra/nitrite-java/blob/bug/get-collection-fails/nitrite/src/test/java/org/dizitart/no2/collection/CollectionFactoryTest.java#L60-L69.

Relevant lines:

database.getRepository(TestEntity.class);
database.getCollection(ObjectUtils.getEntityName(TestEntity.class));

I expect the above code to not throw an exception. As a developer, i should have full control over when i want to access my data using the repository vs. the raw collection.

Note: the same also happens when the calls are reversed (with a similar error message).

anidotnet commented 2 months ago

The enforcement is there for valid reasons:

  1. Opening a repository as a collection and modifying it could corrupt the data generating lots of bug
  2. Opening a collection as a repository would break the nitrite mapping of the entity.
  3. The name encoding function of entity - getEntityName is not part of public api and the logic or behavior could change at any time, and could break the user code in future.

These are some of the primary reasons why you cannot open a collection as repository or vice versa using the standard Nitrite.get* api.

Having said that, if you really want to get hold of the underlying collection of the repository, you could use - repository.getDocumentCollection() at your own risk.

DarkAtra commented 2 months ago

I don't think repository.getDocumentCollection() would work for my use case. Let me try to give a bit more context. I am using nitrite as a database for my discord bot. The bot itself is still heavily in development and the database format changes quite a bit from release to release. As i don't want to cause any manual work for users of my bot, i wrote a database migration service that automatically updates the database state from release to release. (Afaik, there was no migration capability in nitrite in the past and it was just recently introduced with version 4.x.)

My migration service operates on the raw collections. This was a deliberate decision to avoid having to adjust older migrations when package names change. It also allows me to keep the type that i operate all migrations on the same - Document or Collection.

Let's consider the following scenario:

Now in the above example, it would be entirely possible for me to use the repository for example.persistence.TestEntity but that would require me to update this migration step in the future when i change the package name again. So from my point of view, it is cleaner to use the low level access to the collection instead here.

I found that my issue only arises when i'm accessing the collection and repository for the same Entity at the same time. This is likely never happening in production code but is kinda common in test code, like:

Correction: The issue does also arise when first accessing the data through the repository and after that through the collection.

val oldCollection = database.getCollection("example.TestEntity")
val newRepository = database.getRepository(TestEntity::class.java)

assertThat(oldCollection.size()).isEqualTo(1)
assertThat(newRepository.size()).isEqualTo(0)

val oldDocument = oldCollection.find().first()

databaseMigrationService.migrateToLatestVersion()

assertThat(oldCollection.size()).isEqualTo(0)
assertThat(newRepository.size()).isEqualTo(1)

val testEntity = newRepository.find().first()
assertThat(testEntity.id).isEqualTo(oldDocument["id"])
assertThat(testEntity.name).isEqualTo(oldDocument["name"])

My issue right now is, that i don't want to migrate all the existing migration scripts to nitrite's builtin migration capabilities because it's just too much effort and doesn't really bring any real benefit over keeping my current solution.

If you're interested in the full code, refer to: https://github.com/DarkAtra/v-rising-discord-bot/blob/feat/leaderboards/src/main/kotlin/de/darkatra/vrising/discord/migration/DatabaseMigrationService.kt#L182-L219


Opening a collection as a repository would break the nitrite mapping of the entity.

I don't really understand what you mean by this. The repository is using the collection under the hood as well - How would it break if i do the exact same (given that i don't pollute the collection with junk data)?


The name encoding function of entity - getEntityName is not part of public api and the logic or behavior could change at any time, and could break the user code in future.

Regarding this one - i totally agree and i indeed wished that nitrite would allow me to specify the collection name to use when operating with repositories. This would completely eliminate the need for migration scripts in the above scenario as the name wouldn't change as it's no longer bound to the package name. Do you think adding this as a feature would make sense in future releases?

anidotnet commented 2 months ago

My issue right now is, that i don't want to migrate all the existing migration scripts to nitrite's builtin migration capabilities because it's just too much effort and doesn't really bring any real benefit over keeping my current solution.

I understand the initial effort might be high, but the dividend you will get would be worth it. The feature you are asking is already available in migration apis of Nitrite. If you start using the feature, if any other migration related needs arises in future, I might be able to support you in that via migration api, rather than tinkering with Nitrite.get* or any other apis (which I really don't recommend).

I don't really understand what you mean by this. The repository is using the collection under the hood as well - How would it break if i do the exact same (given that i don't pollute the collection with junk data)?

The answer lies in your questions. You might be cautious, but other users might not. Removing these constraints would introduce the chance of polluting data by mistake and hence open a floodgate of issues.

Regarding this one - i totally agree and i indeed wished that nitrite would allow me to specify the collection name to use when operating with repositories.

Repositories are bound to a type not name. The feature you are asking for is not meant for repositories but collection only which are bound by a name. Again I am imploring you to use the migration apis of Nitrite and all your needs would be fulfilled.

If you still don't want to use migration apis, I would suggest to look into the code of Nitrite's migration api and get hold of the underlying NitriteMap of your repository itself and perform your own migration.