Open tonny1983 opened 1 year ago
Hi @tonny1983, Thanks for opening an issue. Let me start by saying that this is not a bug. There are two different APIs here that have their own documentation and behavior. I don't see any evidence that either of them behaves in any unexpected way. Let me go through the points you make in the issue.
Why the collation placeholder?
Databases are different. In fact, MS SQL server is the only DB we are supporting that is case insensitive. This is also the reason why there is a ${collationForCaseSensitivity}
placeholder in the API that can be configured to ignore the case. If you have a look at this class you will see that the ${collationForCaseSensitivity}
is only used to make SQL server compatible with this feature.
Why don't we let the DB decide? User can configure their DB in many ways that are not in the control of Camunda. We have to make some assumptions to make sure a feature works on all platforms. This is why we needed to tame MS SQL server here to make the feature work cross-platform.
Why do the two APIs work differently? This feature was initially introduced for filtering Task names. It was later rolled out for other queries but not all. We would need to adjust every query API individually and did not receive enough evidence that users were missing the case-insensitive search in other queries.
What now?
We will not modify the HistoricVariableInstance
because it works as intended and we can not roll back API for backward compatibility reasons.
For consistency, we might introduce the variableNamesIgnoreCase
flag in the VariableInstance
API. However, it is very unlikely that someone from the team will work on this soon. If you are interested to introduce it and make the two APIs more consistent we would happily accept a contribution.
Some open questions from my side:
variableNamesIgnoreCase
for the historic API to match Strings case-insensitively and it should then behave the same on MS SQL Server as the runtime API. Do you see any problem with this approach?variableNamesIgnoreCase
is not documented properly. What part do you feel is missing? Could you provide a description that you find is more fitting?Next steps: I will close the issue for now. We can discuss the matter further here. For now, there is nothing we will do from our side here.
Let me know what you think. Best, Miklas
Hi @mboskamp ,
Thanksvery much for your kind reply.
Let me answer the problem of the document first, because I'm afraid it is so simple.
For the description of variableNamesIgnoreCase
field of the POST /variable-instance
API in current doc, it is said that
Match all variable names provided in variableValues case-insensitively. If set to true variableName and variablename are treated as equal.
Obviously, the variableValues
there should be variableNames
.
Next, let's talk about the differeces about the two APIs. Actually, the variableNamesIgnoreCase
fields works properly just as the same as what the document said. However, you mentioned that
MS SQL server is the only DB we are supporting that is case insensitive
which is confusing me. Would you mind explaining it more detail please?
Above all, I think I can make a summary about the case sensitivity for name-like query as the following table:
API | Case Sensitive DB | Case Insensitive DB | Conclusion |
---|---|---|---|
Runtime API | C/S | C/I | depends on DB |
Historic API with variableNamesIgnoreCase | C/I | C/I | depends on the query condition |
Historic API w/o variableNamesIgnoreCase | C/S | C/I | depends DB |
Would you please also verify it?
Thank you for your time again.
Best, Tonny
Hey @tonny1983, Sorry for the wait. I think I understand now what the issue is. The table was very helpful to see the differences.
What is the problem?
The historic variable API POST /history/variable-instance
supports a properties variableNamesIgnoreCase
and variableValuesIgnoreCase
.
MS SQL Server by default treats strings case-insensitively so when setting one of the properties to false
variables are still treated case-insensitively (which is expected, because the DB is configured that way). When setting one of the properties to true
the query is configured to be case-sensitive (by using collation) and returns a different result set.
What is the solution?
The historic API should only configure the statement to be case-sensitive when one of the properties variableNamesIgnoreCase
or variableValuesIgnoreCase
is set to true
. See code. If not, then the collation should not be used and it should be up to the DB to decide.
What is not part of the solution?
The runtime API is not affected by this bug. As explained, the ignoreCase
properties are part of a feature that was not rolled out to the runtime API. That means it behaves as expected.
We can also not roll back the ignoreCase
properties from the historic API for compatibility reasons.
What will we do now? I will re-open the ticket and adjust the description. It will then go into our backlog. It might be prioritized by product management but that is not a given since our backlog is very big. If you are interested in fixing the issue in the historic API yourself, we would appreciate a contribution via PR. In that case, I would be available for giving input on solution ideas and code review. Please let me know if you like to work on this.
Cheers, Miklas
Hello @mboskamp , Thanks a lot for your review.
May I confirm that a reasonable solution should be:
variableNamesIgnoreCase
property to be true
should override the case sensitivity of DB configuration to make the query be case insensitive, and if the property is false
, the case sensitivity should depend on the DB configuration.If those are correct, with all due respect, I'm afraid there is something wrong about the problem you described.
The problem is not when the variableNamesIgnoreCase
property setting to be true
but when it is false
or missing (default as false
). As the code shows, the ${collationForCaseSensitivity}
placeholder will be added to the end of the query statement regardless of the value of the property. It works fine for true
case because two sides of the equal check have already be upper case and will not be affected by the collation. On the other side, for false
case, the statement will be forced to be case sensitive due the collation.
Thus, I'm afraid the correct solution is just to delete the ${collationForCaseSensitivity}
placeholder.
Another reason I continue to question the existence of the placeholder is its value. For SQL Server, the value is Latin1_General_CS_AS
which could invalidate other collation options of SQL Server (such as accent-sensitive_AS
, kana-sensitive_KS
and so on). The placeholder would limitate the query for other collation such as Japanese.
To fix the code of the problem may be simple. However, I'm afraid it would affect some features of Camunda Production. If these doubts and limitations clarify, I'm glad to create a PR for it.
Regards, Tonny
Hi @tonny1983,
Thanks again for clarifying. I don't recall the reason for the collation from the top of my head but I believe it was implemented like this with intent. I agree though that the behavior we are seeing is not expected. Right now, it is not clear to me what the best solution would be.
I will create a test PR to run your proposal of removing the collation entirely against our CI. I'll let you know what I find out.
Miklas
Hi @tonny1983, sorry for the long wait. I did not have the time to investigate further. I'll let you know once I had time to look into the CI results.
Hi @tonny1983,
I am sorry to inform you that we currently lack the capacity to bring this issue to a close. I will close the related PR. If you like to work on this bug yourself, we would very much appreciate a pull request. The issue will remain open, but I will remove it from our backlog. Like this, anybody may choose to work on it in the future.
Best, Miklas
Environment (Required on creation)
Description (Required on creation; please attach any relevant screenshots, stacktraces, log files, etc. to the ticket)
The SQL Server, which is case insensitive, is used as the database for Camunda. When querying variables via REST API (
POST /variable-instance
andPOST /history/variable-instance
) usingvariableNameLike
field, the runtime API returns results that are case insensitive (as the same configurations of the DB), but the history API returns only the case sensitive results.Steps to reproduce (Required on creation)
Just call those two APIs with the same
variableNameLike
valueObserved Behavior (Required on creation)
Expected behavior (Required on creation)
Root Cause (Required on prioritization)
For MS SQL Server the historic API is configured to use a special collation that makes the DB case-sensitive for this query only. See https://github.com/camunda/camunda-bpm-platform/blob/f6f49080bdc28eb5d5ce903c0cab7ebba058eaaa/engine/src/main/resources/org/camunda/bpm/engine/impl/mapping/entity/HistoricVariableInstance.xml#L354-L378
The collation is not only applied when the property is used but in any case.
Solution Ideas
Only apply the collation (used as a placeholder via
${collationForCaseSensitivity}
) for both properties when the respective property is set totrue
.