Open robertatpw opened 2 years ago
I took a look, and I don't see any changes between 0.17.4 and 0.40.1 around either readValues
https://github.com/JetBrains/Exposed/blob/master/exposed-dao/src/main/kotlin/org/jetbrains/exposed/dao/Entity.kt#L42
Or the lookup:
https://github.com/JetBrains/Exposed/blob/master/exposed-dao/src/main/kotlin/org/jetbrains/exposed/dao/Entity.kt#L190
I'm wondering, though, is the Affiliate model that you're using being deleted often in your system?
@AlexeySoshin Thank you for your reply. Sorry it took me so long to respond.
To answer your question, we never delete the Affiliates from the database. Perhaps, I'm not understanding the question so if my reply isn't what you were expecting please elaborate.
Also, since I posted this issue there have been some new developments. As a workaround, we implemented a retry mechanism that catches the exception and automatically retries the fetch operation, and that is working.
This is obviously not a long-term solution, but it does seem to indicate that some sort of race condition is occurring. Since the lookup function seems to be synchronous, I'm at a loss as to what could possibly be racing.
Here's the actual workaround so you can see what I'm talking about.
TrafficSourceRepository
fun getAffiliateData(trafficSource: TrafficSourceInterface): AffiliateInterface {
    return acquireReader {
        AffiliateData.createInstance(trafficSource.affiliate)
    }
}
AttributionService
suspend fun getAffiliateWithRetry(requestId: UUID, trafficSource: TrafficSourceInterface): AffiliateInterface {
    val trafficSourceId = trafficSource.trafficSourceId
    var retries = 3
    do {
        retries--
        try {
            return trafficSourceRepository.getAffiliateData(trafficSource)
        } catch (e: Throwable) {
            "getAffiliateWithRetry" error message(
                requestId,
                "Exception on traffic source $trafficSourceId: ${e.message}, retries remaining ($retries)!",
                e
            )
        }
        delay(500)
    } while (retries > 0)
    return getAffiliate(trafficSource)
}
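For readers who want to try the same stopgap, the retry idea can be factored into a small generic helper. This is only a minimal sketch under assumptions: the names `retrying`, `maxAttempts`, `pauseMillis`, and `onFailure` are hypothetical and do not come from Exposed or from the code above. It uses `Thread.sleep` so that it needs nothing beyond the standard library; inside a suspend function, `delay` from kotlinx.coroutines would take its place. Unlike the snippet above, it skips the pause after the final attempt and rethrows the last error instead of falling back to another call.

```kotlin
// Hypothetical helper (not part of Exposed): runs `block` up to `maxAttempts`
// times and returns the first successful result.
fun <T> retrying(
    maxAttempts: Int = 3,
    pauseMillis: Long = 500,
    onFailure: (attempt: Int, error: Exception) -> Unit = { _, _ -> },
    block: () -> T
): T {
    require(maxAttempts >= 1) { "maxAttempts must be at least 1" }
    var lastError: Exception? = null
    for (attempt in 1..maxAttempts) {
        try {
            return block()
        } catch (e: Exception) {
            lastError = e
            onFailure(attempt, e)
            // Pause only when another attempt will follow.
            if (attempt < maxAttempts) Thread.sleep(pauseMillis)
        }
    }
    // Every attempt failed: surface the last error to the caller.
    throw checkNotNull(lastError)
}

fun main() {
    var calls = 0
    // Simulated flaky fetch: fails twice, then succeeds.
    val value = retrying(maxAttempts = 3, pauseMillis = 10) {
        calls++
        if (calls < 3) throw IllegalStateException("transient failure $calls")
        "affiliate-data"
    }
    println("$value after $calls calls")
}
```

A helper like this only hides the symptom; it says nothing about why the lookup fails in the first place.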
@AlexeySoshin There are a few other factors of which you should be aware that may not be obvious from my original posting.
We really need to find a solution to this problem because in the near future we want to move to a pattern of reading all properties from the Exposed entities up front so that we can pass around plain old data objects. In the particular case of the Affiliate model, I was attempting to head off this same exception further down in our code by getting all of the data up front. Instead of resolving the issue, that made the problem more pronounced and moved the exception into the "AffiliateData::createInstance" function rather than elsewhere in the code.
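To make the "plain old data object" pattern concrete, here is a minimal sketch that deliberately avoids Exposed altogether. Everything in it is an assumption made for illustration: `AffiliateView`, `AffiliateSnapshot`, `CountingAffiliate`, and the `id` and `name` properties are hypothetical stand-ins, since the thread does not show the real `AffiliateInterface` or `AffiliateData`. The point is only that the copy reads each property once and keeps no reference to its source. As noted above, in our case doing this moved the exception into the copy step rather than removing it.

```kotlin
// Hypothetical stand-in for the thread's AffiliateInterface; the real
// properties are not shown in the thread.
interface AffiliateView {
    val id: Int
    val name: String
}

// Immutable copy that holds plain values and no reference to its source.
data class AffiliateSnapshot(
    override val id: Int,
    override val name: String
) : AffiliateView {
    companion object {
        // Reads every property exactly once; later reads never touch the source.
        fun from(source: AffiliateView): AffiliateSnapshot =
            AffiliateSnapshot(id = source.id, name = source.name)
    }
}

// Stand-in for a lazily resolved entity: counts how often it is read.
class CountingAffiliate : AffiliateView {
    var reads = 0
        private set

    override val id: Int
        get() {
            reads++
            return 7
        }

    override val name: String
        get() {
            reads++
            return "example"
        }
}

fun main() {
    val source = CountingAffiliate()
    val snapshot = AffiliateSnapshot.from(source)
    // Reading the snapshot repeatedly does not go back to the source.
    repeat(3) { println(snapshot.name) }
    println("source reads = ${source.reads}")
}
```

In a setup like ours, the call to `from` would sit wherever the entity is loaded, so that every entity read happens in one place.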
@robertatpw can you share the part of the code where trafficSource.affiliate is created/filled?
From my side, it looks like it can be built? Also, the Affiliate entity and table mapping could help.
To answer your question, we never delete the Affiliates from the database. Perhaps, I'm not understanding the question so if my reply isn't what you were expecting please elaborate.
@robertatpw Thank you, that was exactly my question. I could expect a race between row deletion and the DAO cache update, but since you never delete it, that's not the issue. Also, thanks for clarifying that the Affiliate entity is not the only one affected.
@robertatpw can you share the part of the code where trafficSource.affiliate is created/filled? From my side, it looks like it can be built? Also, the Affiliate entity and table mapping could help.
@Tapac Thank you for your reply. I don't want to get too caught up in the specifics of how the Affiliate relationship works because, as I mentioned previously, this is happening on several different entities, so it's not specific to our use of this entity. The biggest question I need answered is why the blocking function Entity::lookup, which appears to be completely synchronous, seems to behave asynchronously in some cases. It's rare, but it happens at least a handful of times a day when we're processing millions of transactions a day.
The problem may not be in the Entity class itself but perhaps in the entity manager. Since I don't work on the Exposed code base on a daily basis like the contributors in this forum, I was hoping to get some insight into what might have changed between v0.17.7 and now.
All of that being said, here's a little snippet showing the traffic source's relationship to the Affiliate table/entity. This particular relationship has an eager-loading workaround: in some cases we have the Affiliate entity up front and can set it on the traffic source immediately, which prevents lazy loading when we're pulling down all of the traffic sources in the database.
Again, I don't want to get caught up in this specific relationship, because in other places where the same problem exists the relationships are very simple. In one particular case it's not even on a joined entity; it happens on a primary entity that is newly created but passed around the code as an interface to a data object.
TrafficSourceTable
val affiliate = reference("affiliate_id", AffiliateTable).index()
TrafficSource
var lazyLoadedAffiliate by Affiliate referencedOn TrafficSourceTable.affiliate
private lateinit var eagerLoadedAffiliate: Affiliate
override var affiliate: Affiliate
    get() = if (::eagerLoadedAffiliate.isInitialized) {
        eagerLoadedAffiliate
    } else {
        lazyLoadedAffiliate
    }
    set(value) {
        eagerLoadedAffiliate = value
        lazyLoadedAffiliate = value
    }
@Tapac Following up on my last reply, here's an example where the same problem happens just accessing a property on a newly created entity.
java.lang.NullPointerException: null
at org.jetbrains.exposed.dao.Entity.lookup(Entity.kt:194)
at org.jetbrains.exposed.dao.Entity.getValue(Entity.kt:174)
at com.acquireinteractive.loan.model.LoanRequest.getIdentityId(LoanRequest.kt:320)
at com.acquireinteractive.loan.service.LoanPhoneQualityScoreService.lookupPhoneQuality(LoanPhoneQualityScoreService.kt:34)
at com.acquireinteractive.loan.service.LoanPhoneQualityScoreService$lookupPhoneQualityAsync$1.invokeSuspend(LoanPhoneQualityScoreService.kt:23)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
LoanRequestTable
val identityId = uuid("identity_id").index()
LoanRequest
override var identityId by LoanRequestTable.identityId
A new LoanRequest entity is created on every request and passed around the code base to be referenced by many subsystems. In this particular case it's being used to get some phone quality information about the incoming request. The loan_request table schema is a very flat schema and you can see that the identity_id column is just a UUID. There's nothing special about this particular property. Again, this problem didn't occur in v0.17.7 of the Exposed library. It just started happening when we updated to v0.39.2 and is still happening now that we've upgraded to the latest available version.
Hello.
I am also experiencing issues described in this thread. For reference my library versions are:
We get this in our coroutine-based AWS Lambda, which fans out thousands of rows to a PostgreSQL DB via AWS Aurora.
java.lang.NullPointerException
org.jetbrains.exposed.dao.Entity.lookup(Entity.kt:194)
org.jetbrains.exposed.dao.Entity.getValue(Entity.kt:174)
db.BroadcastTable$Broadcast.getType(BroadcastTable.kt:100)
The BroadcastTable line there has this declaration:
var type by BroadcastTable.type
where BroadcastTable.type is:
var type = text("type")
We don't see it often (twice now out of 1000s of executions), but it causes a batch process to fail and will require some form of retry to cater for this particular issue which is not ideal.
Any insight on resolution would be appreciated. Happy to provide more details if necessary.
Thanks
Description
Recently upgraded from Exposed v0.17.7 to v0.39.2. Initially everything looked great until we moved to production and started finding this exception occurring intermittently. Running this code millions of times a day produces the following exception less than 10 times a day but it seems to indicate there is some change in the Exposed Entity / EntityClass which is failing now whereas it was working consistently in v0.17.7. We have upgraded to v0.40.1 hoping that would fix the problem but the problem persists in v0.40.1.
Dependencies
kotlinLibVersion=1.7.10
kotlinCoroutinesLibVersion=1.6.4
kotlinLanguageVersion=1.7
ktorLibVersion=2.1.0
kodeinLibVersion=7.14.0
shadowJarLibVersion=7.1.2
awsKotlinLibVersion=0.17.8-beta
exposedLibVersion=0.40.1
junitLibVersion=5.9.1
mockkLibVersion=1.13.2
AffiliateData
Exception