ppookk8765 / fluorinefx

Automatically exported from code.google.com/p/fluorinefx

ServiceInvoker is occasionally passing in the wrong service class #11

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
We currently operate in a highly concurrent (~10 req/sec) environment and
occasionally see 'Could not find a suitable method with name...' errors on
calls with valid methods, valid parameters, valid service classes, etc.
The errors are quite random, but with the addition of some new logging code,
I can see that MethodHandler.GetMethod() is being passed the wrong service
type, and hence cannot find the method.

Original issue reported on code.google.com by trentnie...@yahoo.com on 2 Apr 2010 at 12:00

GoogleCodeExporter commented 9 years ago
We're experiencing this issue as well. The problem is a race condition in
FluorineFx/Messaging/Endpoints/Filter/ProcessFilter.cs, lines 152-156. The
FactoryInstance that is returned from destination.GetFactoryInstance() on line
152 is shared between threads, and what often happens is that
factoryInstance.Source is overwritten by another thread by the time
factoryInstance.Lookup() is called! Then the wrong class is searched for the
remote method, and the method cannot be found. It would be great if someone
could fix this -- we don't know nearly enough about the internals of FluorineFx
to feel safe implementing a fix ourselves, and we don't want to take the
performance hit of just throwing a global lock around everything, which is
definitely not necessary.
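
To make the interleaving concrete, here is a minimal standalone sketch of the same pattern. It is not FluorineFx code: SharedFactory is a hypothetical stand-in for the shared FactoryInstance, and the two events only exist to force the unlucky ordering that happens by chance under load.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the shared FactoryInstance; not FluorineFx code.
class SharedFactory
{
    public string Source { get; set; }
    public string Lookup() { return Source; }
}

static class RaceDemo
{
    static void Main()
    {
        var shared = new SharedFactory();   // one instance shared by all requests
        var firstHasWritten = new ManualResetEventSlim(false);
        var secondHasWritten = new ManualResetEventSlim(false);
        string resolvedByFirst = null;

        var first = new Thread(() =>
        {
            shared.Source = "ServiceA";         // request 1 wants ServiceA
            firstHasWritten.Set();
            secondHasWritten.Wait();            // force the unlucky interleaving
            resolvedByFirst = shared.Lookup();  // reads whatever is stored now
        });
        var second = new Thread(() =>
        {
            firstHasWritten.Wait();
            shared.Source = "ServiceB";         // request 2 overwrites the field
            secondHasWritten.Set();
        });

        first.Start();
        second.Start();
        first.Join();
        second.Join();

        // prints "Request 1 asked for ServiceA, resolved: ServiceB"
        Console.WriteLine("Request 1 asked for ServiceA, resolved: " + resolvedByFirst);
    }
}
```

Request 1 ends up looking for its method on the class that request 2 asked for, which matches the "Could not find a suitable method" symptom.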

Original comment by zieDani...@gmail.com on 13 Jun 2010 at 6:16

GoogleCodeExporter commented 9 years ago
We experienced this as well. The solution for us was to define multiple
destinations in the services-config.xml as opposed to one "fluorine"
destination with multiple sources.
By default, Fluorine expects your custom services-config.xml to be in
WEB-INF/flex.
You can of course change this in the source.

Original comment by Damien.B...@gmail.com on 29 Oct 2010 at 10:26

GoogleCodeExporter commented 9 years ago
I'm experiencing the same problem on a production server. Did you come up with
an easy way to reproduce it in a local environment?

Original comment by jerome.d...@gmail.com on 11 Nov 2010 at 10:02

GoogleCodeExporter commented 9 years ago
Couldn't reproduce locally.  Going to try the specific source/dest mapping
between config and flex that Damien recommended.  Won't know whether it works
until we stop getting hundreds of errors/day when we push live. :)

Original comment by trent.ni...@gmail.com on 11 Nov 2010 at 11:00

GoogleCodeExporter commented 9 years ago
Yes, I'm in exactly the same situation. In order to reproduce it locally,
I'm trying to call my gateway from a C# console application. I'm using
NetConnection and passing it "http://localhost:59632/ws/Gatewayx.aspx", is that
the right way to go? I haven't got it working yet.

Original comment by jerome.d...@gmail.com on 13 Nov 2010 at 12:15

GoogleCodeExporter commented 9 years ago
I assume that you need to pass in an AMF-encoded message from .NET.  You'll
also need to initiate multiple calls at once, and those calls should be on
different services.

Original comment by trentnie...@yahoo.com on 13 Nov 2010 at 9:49

GoogleCodeExporter commented 9 years ago
Yes, you're right. I'm trying to make it work from a C# application. If that
doesn't work, I can still try it from a Flash app, although that is not
multi-threaded. I will keep you updated.

Original comment by jerome.d...@gmail.com on 17 Nov 2010 at 1:26

GoogleCodeExporter commented 9 years ago
Capture a sample of the AMF output with Service Capture, or similar.  You should
then be able to simulate a request from Flex by embedding your captured AMF
packet.

Original comment by trentnie...@yahoo.com on 17 Nov 2010 at 2:41

GoogleCodeExporter commented 9 years ago
Here's a patch that seems to fix the problem. It just adds a few locks. Note: 
It might be out of date, the diff paths are wrong, and it includes a bunch of 
unnecessary whitespace changes made by Visual Studio :P.

Original comment by zieDani...@gmail.com on 17 Nov 2010 at 3:53

GoogleCodeExporter commented 9 years ago
@zieDaniel1
  Trying your fix now... Will try to remember to update if this works for me :)

Original comment by Huo...@gmail.com on 18 Nov 2010 at 6:10

GoogleCodeExporter commented 9 years ago
Thanks @zieDaniel1!
Also trying the fix on a local setup.

Original comment by jerome.d...@gmail.com on 22 Nov 2010 at 3:44

GoogleCodeExporter commented 9 years ago
I uploaded the .NET client I'm using to test out FluorineFx with and without 
zieDaniel1's patch: https://github.com/jdecuyper/FluorineFxNetClient

Original comment by jerome.d...@gmail.com on 22 Nov 2010 at 9:52

GoogleCodeExporter commented 9 years ago
With the help of a small C# Fluorine client, I found out that the easiest way to
trigger the error is simply to put the current thread to sleep when a
specific type is requested, inside ProcessFilter.cs:

factoryInstance.Source = amfBody.TypeName;
// artificial delay so another thread has time to overwrite Source
if (amfBody.TypeName == "aSpecialType")
    Thread.Sleep(500);
// query string can override the activation mode
if (FluorineContext.Current.ActivationMode != null)
    factoryInstance.Scope = FluorineContext.Current.ActivationMode;
instance = factoryInstance.Lookup();

Since the current thread sleeps for half a second, it gives another thread
enough time to overwrite the Source value and cause the "Could not find a
suitable method with name ..." error. With the lock proposed by
zieDaniel1 added, the error no longer shows up.
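
The patch itself is not reproduced in this thread, so the following is only a minimal sketch of the general idea, assuming the lock covers both the Source assignment and the Lookup() call. SharedFactory, Gate and Resolve are hypothetical names, not FluorineFx code.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the shared FactoryInstance; not FluorineFx code.
class SharedFactory
{
    public string Source { get; set; }
    public string Lookup() { return Source; }
}

static class LockSketch
{
    static readonly SharedFactory Shared = new SharedFactory();
    static readonly object Gate = new object();
    static int mismatches;

    // Set Source and call Lookup() as one step that no other thread can split.
    static string Resolve(string typeName)
    {
        lock (Gate)
        {
            Shared.Source = typeName;
            Thread.Sleep(1);            // widen the window, as in the repro above
            return Shared.Lookup();
        }
    }

    static void Worker(object state)
    {
        var typeName = (string)state;
        for (int i = 0; i < 20; i++)
        {
            if (Resolve(typeName) != typeName)
                Interlocked.Increment(ref mismatches);
        }
    }

    static void Main()
    {
        var a = new Thread(Worker);
        var b = new Thread(Worker);
        a.Start("ServiceA");
        b.Start("ServiceB");
        a.Join();
        b.Join();

        // prints "Mismatches: 0"
        Console.WriteLine("Mismatches: " + mismatches);
    }
}
```

The lock is held only around the assignment and the lookup, not around the service call itself, so this is much narrower than a global lock around the whole request.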

I have been trying to benchmark both DLLs (with/without lock) locally and on a
remote server but was not able to find a significant difference between them.
Basically I have a big loop which fires 100 methods one by one and stores the
time elapsed between the request and the response. Results vary between 0 and
25 milliseconds. The average time with lock is 78 milliseconds and without lock
75 milliseconds. What other kind of benchmark would you guys think of?
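
One thing to keep in mind: a loop that fires calls one by one only ever has a single request in flight, so the lock is never contended and little difference is expected. A concurrent variant might look like the sketch below. It is a rough in-process sketch only: SimulatedCall is a hypothetical placeholder that would have to be replaced with a real gateway request, and the spin count and thread counts are arbitrary.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class ContentionBenchmark
{
    static readonly object Gate = new object();

    // Hypothetical placeholder for one gateway call; replace with a real request.
    static void SimulatedCall(bool useLock)
    {
        if (useLock)
        {
            lock (Gate) { Thread.SpinWait(2000); }
        }
        else
        {
            Thread.SpinWait(2000);
        }
    }

    // Runs the given number of threads concurrently and returns the wall-clock time.
    static double MeasureMs(int threads, int callsPerThread, bool useLock)
    {
        var workers = new Thread[threads];
        var stopwatch = Stopwatch.StartNew();
        for (int t = 0; t < threads; t++)
        {
            workers[t] = new Thread(() =>
            {
                for (int i = 0; i < callsPerThread; i++)
                    SimulatedCall(useLock);
            });
            workers[t].Start();
        }
        foreach (var worker in workers)
            worker.Join();
        stopwatch.Stop();
        return stopwatch.Elapsed.TotalMilliseconds;
    }

    static void Main()
    {
        foreach (int threads in new[] { 1, 4, 16 })
        {
            Console.WriteLine("{0,2} threads: no lock {1:F1} ms, lock {2:F1} ms",
                threads,
                MeasureMs(threads, 1000, false),
                MeasureMs(threads, 1000, true));
        }
    }
}
```

Comparing the two columns as the thread count grows shows how much time is spent waiting on the lock, which the sequential loop cannot reveal.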

I uploaded a small graph to visualize the benchmark: 
http://jdecuyper.com/fx/index.html

Let me know your thoughts, thanks!

Original comment by jerome.d...@gmail.com on 4 Jan 2011 at 1:22

GoogleCodeExporter commented 9 years ago
A great fix for this on .NET 4.0 is to use the ThreadLocal<T> class.  The
root of this defect is in /FluorineFx/Messaging/Destination.cs,
where the "GetFactoryInstance" method returns a singleton factory.  If you made
the _factoryInstance variable a ThreadLocal<FactoryInstance> instead of a
FactoryInstance, then a FactoryInstance would be created for each thread and
the contention discussed above would go away.
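
A minimal sketch of that idea, assuming a much simplified shape for the two classes. FactoryInstance and Destination below are hypothetical stand-ins that only borrow the names from this thread; they are not the real FluorineFx classes.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for FactoryInstance; not FluorineFx code.
class FactoryInstance
{
    public string Source { get; set; }
    public string Lookup() { return Source; }
}

// Hypothetical stand-in for Destination, holding one factory per thread.
class Destination
{
    private readonly ThreadLocal<FactoryInstance> _factoryInstance =
        new ThreadLocal<FactoryInstance>(() => new FactoryInstance());

    public FactoryInstance GetFactoryInstance()
    {
        return _factoryInstance.Value;   // created lazily, once per thread
    }
}

static class ThreadLocalSketch
{
    static void Main()
    {
        var destination = new Destination();
        var firstHasWritten = new ManualResetEventSlim(false);
        var secondHasWritten = new ManualResetEventSlim(false);
        string resolvedByFirst = null;

        var first = new Thread(() =>
        {
            var factory = destination.GetFactoryInstance();
            factory.Source = "ServiceA";
            firstHasWritten.Set();
            secondHasWritten.Wait();             // same interleaving that breaks the shared version
            resolvedByFirst = factory.Lookup();
        });
        var second = new Thread(() =>
        {
            firstHasWritten.Wait();
            destination.GetFactoryInstance().Source = "ServiceB";  // this thread's own instance
            secondHasWritten.Set();
        });

        first.Start();
        second.Start();
        first.Join();
        second.Join();

        // prints "Request 1 resolved: ServiceA"
        Console.WriteLine("Request 1 resolved: " + resolvedByFirst);
    }
}
```

No lock is needed because each thread only ever touches its own instance. Pooled threads are reused across requests, so each instance keeps its previous Source until the next request on that thread overwrites it.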

The only problem is with application-scoped destinations (i.e. a destination
with a <property> of <scope> set to 'application').  In that case, the instances
will actually be thread scoped, not application scoped.  This could be fixed by
modifying the 'Lookup' method in DotNetFactory.cs to retrieve the application-scoped
instance from the FluorineContext.Current.ApplicationState.

Original comment by yinz...@gmail.com on 31 May 2012 at 8:55