j256 / ormlite-core

Core ORMLite functionality that provides a lite Java ORM in conjunction with ormlite-jdbc or ormlite-android
http://ormlite.com/
ISC License
578 stars 213 forks source link

Investigating odd stack trace from TableInfo.getFieldTypeByColumnName() #229

Closed dlew closed 3 years ago

dlew commented 3 years ago

I'm a little lost trying to figure out the root cause, figured maybe you'd have some idea...

We're running ORMLite on Android, and a few times a day, we're seeing a call to TableInfo.getFieldTypeByColumnName() fail in a seemingly impossible manner:

Caused by java.lang.IllegalArgumentException: You should use columnName 'name' for table organizations instead of fieldName 'name'
       at com.j256.ormlite.table.TableInfo.getFieldTypeByColumnName(TableInfo.java:168)
       at com.j256.ormlite.stmt.StatementBuilder.verifyColumnName(StatementBuilder.java:195)
       at com.j256.ormlite.stmt.QueryBuilder.addSelectColumnToList(QueryBuilder.java:662)
       at com.j256.ormlite.stmt.QueryBuilder.selectColumns(QueryBuilder.java:117)

I'm not sure how you can end up in this state. Based on my reading of getFieldTypeByColumnName(), it essentially:

  1. Creates a Map<ColumnName, FieldType> for the table.
  2. Tries to return the FieldType that matches the provided ColumnName.
  3. If it fails, it tries to figure out if maybe you provided a FieldName instead of a ColumnName.

That all seems reasonable to me; however, in the stack trace it shows the same FieldName and ColumnName. If the error-checking for-loop can find the right FieldType, then why doesn't the Map<ColumnName, FieldType> have it?


We get this crash only intermittently, not consistently, so it's not something fundamentally wrong with the calling code. I suspect some sort of concurrency issue, though I don't see anything obviously wrong here. It could be something with the oddities of SMP on Android. Perhaps the Map creation needs to be synchronized?

We've know we've seen this since ORMLite 5.3, though it's possible we saw it earlier than that (it's just that our crash logs don't go back that far).

Any ideas on how this could happen?

dlew commented 3 years ago

I've tried tracking down the concurrency angle further and I'm not convinced that's the issue. I tried launching a fake version of TableInfo that always recomputes the fieldNameMap, then run it concurrently a whole bunch of times, but I never ended up in this situtation.

Is there some case where FieldType is mutable? Or TableInfo.fieldTypes is mutable? It's not being mutated, as far as I can tell, but maybe something somewhere is doing something odd occasionally?

j256 commented 3 years ago

Yeah the error message is certainly confusing. It was a common problem for people to specify the field-name 'addr' instead of the column-name 'address' in a query. How about if the error said "I can't find a column named 'name' in the table information.". Is it possible that you are using the table-config-file that does not have the column in question?

dlew commented 3 years ago

If the table-config-file were incorrect, then wouldn't always crash on every execution? (Instead, we're seeing it crash maybe one out of every million executions.)

What's most confusing to me is that the exception seems to indicate that this should have worked, since fieldType.getFieldName() == columnName == fieldType.getColumnName().

One other angle - the exception check doesn't use downCaseString(). Perhaps that's why it matches the field name, but not the column name when downcased? (Again, though, this makes no sense because the exception says it failed on "name", which is already lowercase in English.)

j256 commented 3 years ago

The message is definitively screwed up but I think that it's saying the name isn't in the field list. I don't know why it isn't. One thing that I could do is spit out the fields in the object in the future but that doesn't help you now.

j256 commented 3 years ago

Could this be a synchronization thing? The map is created on demand but maybe it should be volatile?

dlew commented 3 years ago

The message is definitively screwed up but I think that it's saying the name isn't in the field list.

How could it not be, if it can point to a FieldType that has the exact name it was searching for?

Could this be a synchronization thing?

That's what I thought, and it could still be that, but:

  1. There's no modification to the map after creation, so if you have a reference to a map, it's essentially immutable.
  2. I tried throwing a concurrency test its way and couldn't get it to crash. (Perhaps my test was bad though.)

One thing that I could do is spit out the fields in the object in the future but that doesn't help you now.

FYI, this is not anything particularly urgent on our end, so if adding some extra debugging would help, I'm for it.

j256 commented 3 years ago

How could it not be, if it can point to a FieldType that has the exact name it was searching for?

That's why I wonder about concurrency. The field-map should use the same name so either the map has not been built yet or properly or was corrupted somehow.

There's no modification to the map after creation, so if you have a reference to a map, it's essentially immutable.

But if you have 2 threads that accessing this table and one initializes the map, the other one might get some partially updated version of the map. Immutability does not imply proper memory publishing. I don't know enough about the Android thread model to comment on whether or not multiple threads (like a UI thread and some sort of background processing) could be at work here.

dlew commented 3 years ago

Supposing this is an extremely rare concurrency issue due to Android threading, would you be open to either:

  1. Making fieldNameMap volatile.
  2. Initializing fieldNameMap in the constructor.

I'm partial to constructor initialization because it rules out an entire class of issues (concurrency), though I don't know how important this optimization is to performance.

I can put up a PR for either, just don't want to do that w/o your approval.

j256 commented 3 years ago

I already have the fix ready to submit but if you'd like to contribute some code I'm happy to take your's.

dlew commented 3 years ago

I was just offering to help in case that would make life easier for you, but if you've already got the code then go for it. :)

j256 commented 3 years ago

Thanks. Happy to have your PR and name on the submission as long as you don't mind me tweaking it after. :-)

dlew commented 3 years ago

I put up a PR here, lemme know what you think: https://github.com/j256/ormlite-core/pull/231