antlr / stringtemplate4

StringTemplate 4
http://www.stringtemplate.org
Other
955 stars 231 forks source link

Use an AtomicInteger for subtemplateCount to avoid race condition #220

Closed seanabraham closed 5 years ago

seanabraham commented 5 years ago

Right now, the way subtemplateCount is implemented is not thread safe since the ++ operator in java does not atomically read and write the incremented integer value. Even if each thread gets it's own STGroup, etc. we can get race conditions since the subtemplateCount field is static. This can lead to a template redefinition since a given thread may read the same for the integer twice, causing two subtemplates to have the same generated name if another thread reads the value before and writes the stale value afterwards.

I tracked this down while debugging an issue which spit out the error message like:

common.stg 8:125: redefinition of template SUBTEMPLATE

There are similar reports in https://github.com/antlr/stringtemplate4/issues/61

parrt commented 5 years ago

Howdy! I don't think that this stuff is thread safe in general. Is a worth patching just a piece of it? I think there are larger problems with multiple threads playing with a single template.

seanabraham commented 5 years ago

@parrt it's true there are other thread safety issues but this one is worse since it will crop up even if you're not sharing a template or STGroup between multiple threads. It can happen if there are at least two templates being loaded in the same JVM, even if they have nothing else to do with each other.


For context, imagine we have multiple threads which each have their own STGroup instance.

If the threads concurrently load templates which each have anonymous subtemplates from their respective STGroups around the same time the race condition can arise.

This is how it would happen:

When a template is being loaded each of its anonymous subtemplates need to be defined. In the process of defining the anonymous subtemplate the Compiler.getNewSubtemplateName() static method is called to get the name which includes a seemingly always increasing global integer that is appended to the string "_sub": https://github.com/antlr/stringtemplate4/blob/2a5ae02f7b528927ed32352c1c4a5c8d92aaaac8/src/org/stringtemplate/v4/compiler/CodeGenerator.g#L236

Within this method, the static subtemplateCount integer is incremented, but since the ++ operation is not atomic — the read and write are distinct operations — multiple threads may be calling it around the same time which can lead to the subtemplateCount variable being set to an old value and later read again.

For context, since the integer ++ operation is not atomic the line:

subtemplateCount++

effectively decomposes to something like the following operations:

int temp = read(subtemplateCount)
write(subtemplateCount, temp+1)

So, if subtemplateCount is currently set to 3 and you have, say, three threads calling this static method concurrently you could see something like:

thread1 - read(subtemplateCount) -> 3
thread2 - read(subtemplateCount) -> 3
thread2 - write(subtemplateCount+1) -> 4
thread3 - read(subtemplateCount) -> 4
thread3 - write(subtemplateCount + 1) -> 5
**thread1 - write(subtemplateCount + 1) -> 4**

and then if template that thread3 is defining has another anonymous subtemplate it will invoke the method again with the following result:

**thread3 - read(subtemplateCount) -> 4**
thread3 - write(subtemplateCount+1) -> 5
...

At this point, two different anonymous subtemplates will have the same name and eventually we'll crash with the compiler error redefinition of template SUBTEMPLATE

This is prevented via the use of the AtomicInteger class since the read + write are both contained in a single atomic operation.


Also, for what it's worth we are running hundreds of thousands of string template invocations every week and this is root cause of the only race condition we are currently seeing so I personally believe it's worth fixing and have reason to believe there are few others of this class, if any, in play (as long as no templates or STGroup instances are being shared among threads).

seanabraham commented 5 years ago

@parrt 👋, any further thoughts on this? Seeing as it's a relatively straightforward fix I'd love to see it merged if you agree. If you don't think that's something that can happen soon, that's okay but I'd love to know so I can plan to release a fork internally.

parrt commented 5 years ago

Sorry for delay, @seanabraham, brain stuck on teaching at moment. You're right! This can't be fixed by having one thread per template as it's a global var! zoiks. merging if you can add a commit to sign https://github.com/antlr/stringtemplate4/blob/master/contributors.txt

seanabraham commented 5 years ago

Excellent, thanks @parrt! I've signed contributors.txt in this PR: https://github.com/antlr/stringtemplate4/pull/222

seanabraham commented 5 years ago

@parrt out of curiosity — any chance the library is going to get a release soon? I'd be happy to help prep a release if there's anything I can do.

parrt commented 5 years ago

I guess it's been awhile and this is an important fix. @sharwell any thoughts?

sharwell commented 5 years ago

@parrt Happy to help if needed

parrt commented 5 years ago

I've added a release TODO on my calendar for when I finish this class I'm teaching mid-October