simbabarry / smslib

Automatically exported from code.google.com/p/smslib

Enhance Service with simple lifecycle management. #181

Closed - GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
Hi

I think Service could be enhanced with simple lifecycle management to reduce
unnecessary synchronization. For example, Service.sendMessage can work perfectly
well without any synchronization on the gateway list. The Router may require it,
but that's another story.
If Service protected the gateway list from external modification through lifecycle
management (i.e. threw an exception if a gateway is added after Service.start has
been invoked) and from internal modification by contract, it would make Service
much more efficient.
Bottom line: I have a feeling that the synchronization should be reviewed once
again to make gateway utilization more effective. The current 3.4 simply doesn't
allow parallel sending of messages.
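
As a rough illustration of the kind of lifecycle protection I mean (a hypothetical class, not the actual smslib Service):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: once start() is called, the gateway list becomes
// effectively immutable, so readers no longer need to lock it.
public class LifecycleService {
    private final List<String> gateways = new ArrayList<String>(); // gateway ids as placeholders
    private volatile boolean started = false;

    public void addGateway(String gatewayId) {
        if (started)
            throw new IllegalStateException("Gateways cannot be added after start()");
        gateways.add(gatewayId);
    }

    public void start() {
        started = true; // the list is frozen from this point on
    }

    public List<String> getGateways() {
        return Collections.unmodifiableList(gateways); // safe to iterate without synchronization
    }
}
```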

Thanks.

Original issue reported on code.google.com by allat...@gmail.com on 5 Feb 2009 at 10:58

GoogleCodeExporter commented 9 years ago
Hi,

No objections here, but I am mostly worried about your last phrase:
<<<
The current 3.4 simply doesn't allow parallel sending of messages.
>>>

I've emulated parallel sending with the test gateway and it seems to work fine,
after adjusting the new Settings.SCHEDULER_POOL configuration option.

Is your experience different? Do you have any details to share?

Original comment by T.Delenikas on 5 Feb 2009 at 11:20

GoogleCodeExporter commented 9 years ago
Yes, but sendMessage is synchronized on the gateway list. Sending is done in
separate threads, yes, but only one thread is actually sending a message at any
given moment.

I have fixed it, but I ran into severe thread-management issues (all threads end
up blocked by time-consuming operations, which blocks Service operations
completely; I will create a separate issue for it as soon as I figure out an
acceptable solution).

Original comment by allat...@gmail.com on 5 Feb 2009 at 11:26

GoogleCodeExporter commented 9 years ago
With how many modems are you working?

Original comment by T.Delenikas on 5 Feb 2009 at 11:33

GoogleCodeExporter commented 9 years ago
1 physical (and 4 TestGateways if I need to test concurrency).

Original comment by allat...@gmail.com on 5 Feb 2009 at 11:36

GoogleCodeExporter commented 9 years ago
I think we should allow for disabling or even outright removal of the KeepAlive
thread of the modem gateway. If this is done, it becomes the responsibility of
the watchdog thread to restart the gateway should a timeout occur. I think this
would remove any synchronization needs in the modem gateway itself, since it is
only the keep-alive thread that requires the lock on the output stream.

Original comment by jjon...@gmail.com on 5 Feb 2009 at 12:29

GoogleCodeExporter commented 9 years ago
KeepAlive was originally created to "ping" the modem in order to keep the
connection alive. Implicit timeouts were observed in (mainly) Bluetooth
connections - those connections were dropped after some inactivity.

I know that Bluetooth is not the best or preferred way to connect to a modem. But
should we remove it? Do you think it affects throughput in a significant way?

Original comment by T.Delenikas on 5 Feb 2009 at 7:04

GoogleCodeExporter commented 9 years ago
I would appreciate your feedback on r1738. Details are in the log message.

Original comment by T.Delenikas on 6 Feb 2009 at 9:00

GoogleCodeExporter commented 9 years ago

Original comment by T.Delenikas on 6 Feb 2009 at 9:00

GoogleCodeExporter commented 9 years ago
Hi.

A little description of what I originally encountered:
1. The thread pool has a core size of 3.
2. 5 modems total, 3 of them offline.
3. All 3 core threads are blocked by ATHandlers waiting for the comm port timeout.
4. The task queue slowly fills up to 100 tasks.
5. Once the task queue is full, even the additional threads (up to the maximum
pool size) cannot cope with the growing queue, so we are flooded with
TaskRejectedException (or something like it, I don't have the logs here).

My original fix was to create two types of tasks: fixed rate and fixed delay. It
effectively solved the problem of the growing queue, but the blocking of the core
thread pool by KeepAlive stayed the same.

I thought hard about this problem and came to the conclusion that the only
solution would be to define groups of tasks that should not and must not execute
in parallel for a single Gateway (for example, KeepAlive and QueueManager). That
would prevent tasks from occupying threads just to sit there waiting for a lock
to be released.

I think there are only three "entities" to consider for synchronization:
operations on the ATHandler, the message queue, and the gateway list. The message
queue can be implemented as a Vector (it is thread-safe, so there is no need to
synchronize on it explicitly), and the gateway list should be protected from
modification by the Service, so there should be no need to synchronize operations
on it either. And the ATHandler... What do you think about combining the
read/write operations into a single one? For example, the following generic
method: public synchronized String execute( String operation ). This method would
send 'operation' to the modem and wait for the response, guaranteeing that only
one operation executes on the modem at any moment. Additionally, the ATHandler
could hold a Map<String, Integer> with the amount of time to wait when a special
operation is executed (for example, '+++' or 'ATZ'). Won't there be any issues
with this approach (performance, operation ordering)?
Is there any need for additional synchronization?
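
A rough sketch of what this combined operation could look like (a hypothetical class, not the actual ATHandler/ModemDriver API):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a combined request/response AT operation; it only
// illustrates the "one synchronized execute()" idea described above.
public class SimpleATHandler {
    // Extra wait times for special commands such as '+++' or 'ATZ'.
    private final Map<String, Integer> waitTimes = new HashMap<String, Integer>();

    public SimpleATHandler() {
        waitTimes.put("+++", 2000);
        waitTimes.put("ATZ", 1000);
    }

    public synchronized String execute(String operation) throws Exception {
        write(operation);                        // send the command to the modem
        Integer extra = waitTimes.get(operation);
        if (extra != null) Thread.sleep(extra);  // honour command-specific delays
        return readResponse();                   // read the matching response
    }

    private void write(String operation) { /* write to the serial port */ }

    private String readResponse() { return "OK"; /* read from the serial port */ }
}
```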

Good day.

Original comment by allat...@gmail.com on 7 Feb 2009 at 12:17

GoogleCodeExporter commented 9 years ago
Hi Vadim,

I think there is a definite improvement in the last commit (r1738) - at least
that's what I see by emulating various numbers of "test" gateways combined with
various core tasks. I used to get blocking, reject exceptions and everything else
you describe, but the lib seems much more stable now. I got rid of several synced
blocks, and all gateways work in parallel now (assuming you've adjusted
QUEUE_THREADS accordingly to keep all gateways busy).

My goal is to also automatically adjust the three new settings which affect the
queues and other threads, so that the end user isn't expected to tune them.

Your last paragraph seems a bit overcomplicated to me... A couple of the ideas are
now implemented (like the gateway list being protected by the Service). But do we
really need this complex approach for the AT handlers?

Original comment by T.Delenikas on 7 Feb 2009 at 3:30

GoogleCodeExporter commented 9 years ago
Good day.

The current problem is that two threads can invoke "write" at the same time. In
this case the value returned by "response" is undefined ("thread1" can read the
response to "thread2"'s command and vice versa). This is resolved by synchronizing
on "SYNC_Commander".

I think the proposed approach is actually simpler than the current one. Really,
all AT operations are a pair of "request" and "response". There is no reason to
split a single operation into two different methods ("writeRequest" and
"readResponse"). If the ATHandler guarantees that it performs only one operation
at a time, we'll be able to remove almost all synchronized blocks in ModemDriver.
That removes unnecessary complexity from the methods and can slightly improve
performance: invoking a synchronized method is a little faster than entering a
synchronized block. And more importantly: fewer synchronized blocks means less
chance of eventually hitting a deadlock.

Original comment by allat...@gmail.com on 9 Feb 2009 at 1:53

GoogleCodeExporter commented 9 years ago
I used to have all of ATHandler's methods synced (a few versions ago). I started
to have problems once I added code to handle unsolicited modem responses (which
require two distinct READ operations; for example, the ring indication consists
of one RING and one CLIP). At that point, I moved the locking up a level.

Original comment by T.Delenikas on 9 Feb 2009 at 2:35

GoogleCodeExporter commented 9 years ago
Hi. Sorry for the late patch, but here it is nonetheless :)

Changes:
1. The task lifecycle has been decoupled from the scheduler.
2. Task now supports execution at a fixed rate and with a fixed delay (this
should solve any problems with task queue overflow and makes the related changes
in QueueManager obsolete).
3. QueueManagers now process messages until the queue becomes empty, and only
then does the task finish its run.
4. Some renaming in the Scheduler package. I think the names now correspond
better to each class' responsibility. It's up to you to decide whether it should
be left as is.
5. One questionable change is the thread pool size reassignment in
org.smslib.Service.startService(boolean). I would have removed it, but I was
afraid of corrupting the patch file :)

Original comment by allat...@gmail.com on 11 Feb 2009 at 12:34

Attachments:

GoogleCodeExporter commented 9 years ago
I will get back to you on this once I've taken a look at the patch.

Original comment by T.Delenikas on 12 Feb 2009 at 6:58

GoogleCodeExporter commented 9 years ago
One question, please - I am looking through your code.

Your modified QueueManager has a loop (your change #3). Assuming we have a large
queue, won't this code keep the modems extremely busy until the entire queue has
been processed? Or am I missing something?

Original comment by T.Delenikas on 18 Feb 2009 at 7:50

GoogleCodeExporter commented 9 years ago
Yes, the point was to keep the modem busy while there are messages to be sent.
Now that you mention it... could that be the reason why the modem can't register
on the network?...

Original comment by allat...@gmail.com on 19 Feb 2009 at 2:53

GoogleCodeExporter commented 9 years ago
From my experience, it's a (very) bad thing to keep a modem 100% busy and 0%
idle... it can easily freak out (reboots, constant CMS errors, etc). That's why
the current queue implementation leaves some idle time in between.

Original comment by T.Delenikas on 19 Feb 2009 at 3:00

GoogleCodeExporter commented 9 years ago
One more thing: assuming that you have a big queue, by keeping the modems busy
you won't be able to receive anything during the dispatch. If it takes you half
an hour to get rid of the entire queue, your inbound messaging will be dead for
this period.

Original comment by T.Delenikas on 19 Feb 2009 at 3:06

GoogleCodeExporter commented 9 years ago
Yes, you are absolutely right. This loop was a mistake.
The number of queue managers should exactly match the number of running gateways.
Maybe the gateway itself should manage the state of its QueueManager, disabling
and enabling it when appropriate?

Original comment by allat...@gmail.com on 20 Feb 2009 at 9:16

GoogleCodeExporter commented 9 years ago
Hmm... Your last sentence sends us back to the time when the queue manager
belonged to the gateway... Remember our old discussions? :)

Now the queue managers belong to the Service. The number of queue managers could
be as high as the number of OUTBOUND gateways. Having fewer queues may be
beneficial as well, as it's a way of not keeping all gateways at 100% utilization
(which is bad if we are talking about modems).

Original comment by T.Delenikas on 21 Feb 2009 at 6:48

GoogleCodeExporter commented 9 years ago
If you create 5 gateways and one of them fails permanently (for example, our
clients report dead SIM cards now and then), you will see the problem. The fifth
QueueManager will try to send a message and will always fail, because the 4
active gateways are already occupied by 4 managers. This send cycle just
increments the message's send-attempt counter without an actual attempt to send
it.

The queue should be central, but the QueueManager should not. If you draw the
smslib architecture from the point of view of class responsibilities, I'm sure
you will come to the same conclusion, because only the gateway knows when it can
handle a message, how many messages, at what interval, etc. For example, SMPP
doesn't need any delay at all: in transceiver mode it can send and receive
simultaneously. With the current default QueueManager implementation every
message is sent with a 300-400 ms delay at best (the Scheduler tick). I think
that greatly reduces SMPP's potential throughput. So an SMPP gateway should be
able to occupy a thread while there are messages to be sent and send them
continuously.

Original comment by allat...@gmail.com on 21 Feb 2009 at 7:08

GoogleCodeExporter commented 9 years ago
The current responsibility of the QueueManager is simply to pick up a message
from the queue and call sendMessage() in order to send it.

Unless you have a redesign in mind, I fail to see why the QueueManager belongs to
the Gateway. A message does not always carry "preferred gateway" information. So
gateway A's QueueManager might actually push a message to gateway B. What I am
saying is that the QueueManager could belong to a Gateway only if the routing and
preferred gateway information is available from the beginning - before the
message is even pushed into the queue.

Original comment by T.Delenikas on 21 Feb 2009 at 7:20

GoogleCodeExporter commented 9 years ago
Here are descriptions of two possible designs:
1. The Modem takes a message from the central queue when it is ready to send (or
we could say "ask the Service for messages for this particular Modem", to take
message routing into account).
2. The Service periodically tries to send messages until there is a Modem ready
to send them.

I think option 1 is more "natural". Only in that case can the Modem send messages
at the rate that fits it best. Maybe it will involve a significant redesign, but
I hope it won't :)

Original comment by allat...@gmail.com on 21 Feb 2009 at 7:32

GoogleCodeExporter commented 9 years ago
I like your first idea. :)

Let's assume that we make the central queue "smart", for example smart enough to
pull messages for a specific gateway. Also, we transfer the QueueManagers to the
Gateway level, applying specific "rate" logic in them (for example, a modem
gateway would query for messages at a much slower rate than an HTTP or SMPP
gateway).

Now, what would happen with messages that don't have the "preferred-gateway"
info? Should we also have another central QueueManager for pushing those?

Original comment by T.Delenikas on 21 Feb 2009 at 7:54

GoogleCodeExporter commented 9 years ago
For example, we could have the following method in Service:
public OutboundMessage pollMessage( AGateway gtw );

It would select the best message for this specific gateway. How the queue is
implemented is actually not important, as long as it follows this description.

We could iterate over one big list of messages looking for one that has this
gateway in its "routable" list. In this case the "routable" list should be cached
to reduce the search cost.
Or we could create N lists for N gateways. A message would be inserted into the
list of every gateway in its "routable" list. One downside is that the message
would have to be removed from the N lists atomically, which would require
synchronization (bad bad bad).
I think there are better options here; it will just take some time to find them.
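
A rough sketch of the polling idea (stub types standing in for smslib's AGateway/OutboundMessage; the routing check is hypothetical):

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch of a central queue that gateways poll when they are
// ready to send; not the actual smslib Service implementation.
class Gateway {
    final String id;
    Gateway(String id) { this.id = id; }
}

class Message {
    final List<String> routableGatewayIds; // gateways allowed to send this message
    Message(List<String> routable) { this.routableGatewayIds = routable; }
}

public class CentralQueue {
    private final ConcurrentLinkedQueue<Message> queue = new ConcurrentLinkedQueue<Message>();

    public void queueMessage(Message msg) {
        queue.add(msg);
    }

    // Called by a gateway when it is ready to send; removes and returns the
    // first message that is routable via this gateway, or null if none.
    public synchronized Message pollMessage(Gateway gateway) {
        for (Iterator<Message> it = queue.iterator(); it.hasNext();) {
            Message msg = it.next();
            if (msg.routableGatewayIds.contains(gateway.id)) {
                it.remove();
                return msg;
            }
        }
        return null;
    }
}
```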

Original comment by allat...@gmail.com on 21 Feb 2009 at 8:33

GoogleCodeExporter commented 9 years ago
I am thinking of proceeding this way:

1) Create a master Queue as well as a Queue per gateway.
2) Create a master QueueManager as well as a QueueManager per gateway.
3) A message will, at the beginning, be added to the master Queue.
4) The master QueueManager will route it to the appropriate gateway's Queue.
5) Each gateway Queue/QueueManager will work as it used to. It will be
responsible for keeping its own pace.
6) If a message from a gateway queue fails to be sent, it will be put back into
the master Queue (cleared of its preferred gateway information) in order to be
rerouted, probably via a different gateway.

Last but not least, something should be done to allow scheduled threads to run
faster than the tick resolution (to support fast gateways, like the HTTP one).
Maybe decouple them from the thread scheduler and run them as standalone threads?
Or ditch the current thread implementation and start again from the v3.3 status?
I lean towards the second solution, although it would be a real pain to revert
the code to the old status...

Original comment by T.Delenikas on 22 Feb 2009 at 7:36

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
I don't see why there must be a per-gateway queue. We would end up in the same
situation as with v3.2 - draining a gateway queue back to the master queue when
something goes wrong, and the need to balance the per-gateway queues to maintain
proper load on the gateways.

As for thread scheduling - why not switch to
java.util.concurrent.ScheduledThreadPoolExecutor? It may not have
"enable/disable" or "runNow" functionality, but it doesn't depend on ticks. At
least it looks that way.
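
For reference, a minimal usage example of the two scheduling modes this JDK class offers (the task names are just placeholders, not smslib classes):

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Minimal example of the JDK scheduler mentioned above.
public class SchedulerDemo {
    public static void main(String[] args) {
        ScheduledThreadPoolExecutor scheduler = new ScheduledThreadPoolExecutor(4);

        Runnable keepAlive = () -> System.out.println("keep-alive ping");
        Runnable queueCheck = () -> System.out.println("check outbound queue");

        // Fixed rate: fires every 30s regardless of how long each run takes.
        scheduler.scheduleAtFixedRate(keepAlive, 0, 30, TimeUnit.SECONDS);

        // Fixed delay: waits 5s after each run completes before the next one,
        // so a slow (blocked) run cannot pile up pending executions.
        scheduler.scheduleWithFixedDelay(queueCheck, 0, 5, TimeUnit.SECONDS);
    }
}
```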

Original comment by allat...@gmail.com on 22 Feb 2009 at 12:11

GoogleCodeExporter commented 9 years ago
<<
I don't see why there must be a per-gateway queue. We would end up in the same
situation as with v3.2 - draining a gateway queue back to the master queue when
something goes wrong, and the need to balance the per-gateway queues to maintain
proper load on the gateways.
>>
But who will perform the routing (i.e. the Router/Balancer logic) if there is
only one queue and multiple queue managers? Will every gateway iterate over all
messages in the master queue and call the Router/Balancer logic for each one to
see if there is a match?

<<
As for thread scheduling - why not switch to
java.util.concurrent.ScheduledThreadPoolExecutor? It may not have
"enable/disable" or "runNow" functionality, but it doesn't depend on ticks. At
least it looks that way.
>>
I thought so, too. But I doubt that it will be as efficient as a plain Thread
(with a while/sleep structure) for fast-switching tasks, like the one you will
need for SMPP.

I will make an attempt to switch to the ScheduledThreadPoolExecutor in the
following hours anyway. It certainly seems more appropriate for the current
framework. Thanks :)

Original comment by T.Delenikas on 22 Feb 2009 at 12:34

GoogleCodeExporter commented 9 years ago
>>
But who will perform the routing (i.e. the Router/Balancer logic) if there is
only one queue and multiple queue managers? Will every gateway iterate over all
messages in the master queue and call the Router/Balancer logic for each one to
see if there is a match?
<<
That's the tricky part. The message routing information should be calculated at
the moment the message enters smslib, as it won't change after that - will it? If
there is only one queue, balancing isn't necessary, since the gateways will
handle it themselves. Maybe there is something useful in issue 177 [1].

>>
I thought so, too. But I doubt that it will be as efficient as a plain Thread
(with a while/sleep structure) for fast-switching tasks, like the one you will
need for SMPP.
<<
SMPP may have its own queue manager implementation which doesn't stop until
there are no more messages to be sent via SMPP.

[1] http://code.google.com/p/smslib/issues/detail?id=177

Original comment by allat...@gmail.com on 22 Feb 2009 at 3:58

GoogleCodeExporter commented 9 years ago
r1768: Switch to the ScheduledThreadPoolExecutor implementation.

The queues issue is still pending.

Original comment by T.Delenikas on 22 Feb 2009 at 8:21

GoogleCodeExporter commented 9 years ago
Back to the queues issue:

I think that the per-gateway queue concept cannot be avoided (unfortunately).

SMSLib has to manage different gateways with different timing requirements. For
example, the Modem gateway is the slowest of all, the HTTP gateway is a decent
performer, and the (future) SMPP gateway a really fast one. Implementing these
different timing and stepping requirements in a per-gateway queue architecture
(instead of a central one) is the best approach, at least according to my logic.
Of course, as you pointed out, it's not the easiest one (re-balancing the queues
in case of failures, etc).

It's more or less what I've described in comment #26 above.

Regarding Issue #177, I think it could be implemented in this scenario as well.
Imagine the per-gateway queue pushing a failed message back to the central queue
in order to be re-routed according to the new priority rules (this is, though, a
big change and I will probably leave it for v3.4+).

Original comment by T.Delenikas on 26 Mar 2009 at 8:17

GoogleCodeExporter commented 9 years ago
I don't quite understand why you think the per-gateway queue is the best
approach... You make the central queue responsible for feeding the gateway queues
with an appropriate amount of messages (i.e. it now has to know the gateway
internals). Why is the gateway-asks-the-central-queue-for-new-messages-when-it-is-ready
approach worse?

Original comment by allat...@gmail.com on 26 Mar 2009 at 9:13

GoogleCodeExporter commented 9 years ago
I don't believe it's worse per se, it just seems more logical to me in terms of
which component is responsible for which task.

For example, it seems more logical to me that if a gateway fails, it would be
responsible for pushing the remaining messages from its queue back to the central
queue for re-dispatch. Otherwise it would have to somehow inform the Service so
it could iterate the central queue and remove the message-gateway associations
made before the failure.

Original comment by T.Delenikas on 27 Mar 2009 at 7:18

GoogleCodeExporter commented 9 years ago
Yes, I totally agree with you that the gateway is responsible for returning
unsent messages if it fails. But I still don't understand why you think that
would be a problem in the CQ (central queue) case.

Original comment by allat...@gmail.com on 27 Mar 2009 at 9:57

GoogleCodeExporter commented 9 years ago
I think it would be nice (and easier, if I may say so) to have a central
authority for the dispatching of messages. For example, it would be easy to add
your suggestion (Issue #177) in the CQ, instead of having standalone methods here
and there or (worse) implementing inter-gateway communication.

Original comment by T.Delenikas on 28 Mar 2009 at 12:14

GoogleCodeExporter commented 9 years ago
I think we misunderstand each other... The only difference between us is the
following:
yours - the CQ should route messages to the gateways as soon as it receives them
from the client (these messages are then stored in PGQs (per-gateway queues))
mine - the CQ should store all messages, and the gateways should ask the queue to
give them messages to send (i.e. the CQ makes the final decision about which
message a gateway will receive when that gateway polls the queue)

In both cases the routing is performed by the CQ; the gateways play a passive
role here.

Here is an activity diagram of the message sending process from the gateway's
point of view.

Original comment by allat...@gmail.com on 28 Mar 2009 at 12:40

Attachments:

GoogleCodeExporter commented 9 years ago
I don't think that we really misundestood each other. I understand what you are 
saying - if you will, our difference is whether the gateways will have their 
own 
queues or not.

Your diagram makes sense. Lets give it a shot your way (I am in a hurry to 
publish a 
first beta of v3.4, since there are corrections for several bugs).

The question is now how to implement the data structure holding the central 
queue. Do 
you agree to go with a Hashmap of Priority Queues? The hash key would be the 
gateway 
id and the priority queue will cater for correct sequencing. It should be 
faster than 
a big list, where we would have to iterate and compare.
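
Something along these lines, perhaps - a hypothetical sketch with a stub message type, not actual smslib code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.PriorityBlockingQueue;

// Hypothetical sketch of the "HashMap of priority queues" idea; a stub
// message type stands in for smslib's OutboundMessage, with priority
// ordering expressed through Comparable.
class QueuedMessage implements Comparable<QueuedMessage> {
    final int priority;
    final String text;

    QueuedMessage(int priority, String text) {
        this.priority = priority;
        this.text = text;
    }

    public int compareTo(QueuedMessage other) {
        return Integer.compare(other.priority, this.priority); // higher priority first
    }
}

public class GatewayQueueMap {
    // One priority queue per gateway id.
    private final Map<String, PriorityBlockingQueue<QueuedMessage>> queues =
            new ConcurrentHashMap<String, PriorityBlockingQueue<QueuedMessage>>();

    public void addQueue(String gatewayId) {
        queues.putIfAbsent(gatewayId, new PriorityBlockingQueue<QueuedMessage>());
    }

    public boolean queueMessage(String gatewayId, QueuedMessage msg) {
        PriorityBlockingQueue<QueuedMessage> q = queues.get(gatewayId);
        return q != null && q.offer(msg);
    }

    public QueuedMessage pollMessage(String gatewayId) {
        PriorityBlockingQueue<QueuedMessage> q = queues.get(gatewayId);
        return q == null ? null : q.poll();
    }

    public int getLoad(String gatewayId) {
        PriorityBlockingQueue<QueuedMessage> q = queues.get(gatewayId);
        return q == null ? 0 : q.size();
    }
}
```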

Original comment by T.Delenikas on 28 Mar 2009 at 8:01

GoogleCodeExporter commented 9 years ago
I'm not in a position to advise, but why not defer it to post-3.4? The CQ will be
a drastic change and should be implemented carefully... There is still a severe
issue with the number of QueueManagers versus the number of Gateways, which leads
to messages being dropped quickly because of unsuccessful delivery attempts - and
it looks like I forgot to report it :)

Original comment by allat...@gmail.com on 28 Mar 2009 at 9:01

GoogleCodeExporter commented 9 years ago
Hmm... I guess we can leave this for a later stage.

Anyway, the current trunk has a CQ as well as per-gateway queues (similar to the
previous version), so the QueueManager issues you are referring to are probably
history... Unless you are referring to something new(?)

Original comment by T.Delenikas on 28 Mar 2009 at 9:22

GoogleCodeExporter commented 9 years ago
Yep, I checked the code in the trunk - the PGQ resolved the issue (I had solved
it by moving the QueueManagers into the Gateways and enabling/disabling them when
the Gateway changes its status).

Original comment by allat...@gmail.com on 28 Mar 2009 at 10:16

GoogleCodeExporter commented 9 years ago

Original comment by T.Delenikas on 29 Mar 2009 at 1:27

GoogleCodeExporter commented 9 years ago
This is a first draft of what I had in mind after our discussion:

http://code.google.com/p/smslib/source/browse/smslib/branches/Queues/src/java/org/smslib/QueueManager.java?spec=svn1851&r=1851

A central queue-managing class, holding a Map of PriorityBlockingQueues. It
includes helper methods to manage the map, add queues for gateways, add/remove
messages, and some other informative methods (getLoad() etc). The Service class
has been adapted to push messages to this QueueManager class. The routing logic
is missing, but it will be added in the QueueManager class (once the
queueMessage() method is called).

In this revision, the Gateways are not yet updated to retrieve messages from this
class.

The QueueManager will also be extended to offer functionality covering Issue #196
and Issue #197.

Comments?

Original comment by T.Delenikas on 1 Apr 2009 at 7:32

GoogleCodeExporter commented 9 years ago
In addition to the router logic, the message retrieval method is also missing :)

If one message can be queued for only one gateway at any given time, this
implementation is fine. However, I think the public methods should be extracted
into an interface to make switching implementations easier.

Original comment by allat...@gmail.com on 1 Apr 2009 at 11:11

GoogleCodeExporter commented 9 years ago
<<<
If one message can be queued for only one gateway at any given time, this
implementation is fine.
>>>
Sorry, I think I missed your point here. The Router logic returns one gateway;
how could one message be queued to more than one gateway?

Original comment by T.Delenikas on 2 Apr 2009 at 6:11

GoogleCodeExporter commented 9 years ago

Original comment by ad...@smslib.org on 4 Apr 2009 at 1:00

GoogleCodeExporter commented 9 years ago

Original comment by T.Delenikas on 21 Apr 2009 at 9:08

GoogleCodeExporter commented 9 years ago

Original comment by T.Delenikas on 22 Apr 2009 at 6:40

GoogleCodeExporter commented 9 years ago

Original comment by T.Delenikas on 22 Apr 2009 at 7:11

GoogleCodeExporter commented 9 years ago
Hi, Thanasis.

I was going to spend some time implementing the queue the way I described it back
in 2009. I skimmed through the revision history and found that you switched back
from the central queue to per-gateway queues. What was the reason for this change?
And, by the way, since 3.5.0 there hasn't been much change in smslib, judging by
the SVN revision history. Is it because smslib is now feature-complete, or is
there something brand new to expect?

Regards.

Original comment by allat...@gmail.com on 20 Feb 2011 at 11:18