Closed vladimir-mencl-eresearch closed 13 years ago
Might need to do that again at some stage and look through the backend logs. BTW, it might prove useful for you to use a local backend for this kind of testing (if there are no firewall issues, that is). That way you can monitor the debug log for both the frontend and the backend.
This might be improved by Yuriy's changes to fix the connection problem. It should be thread-safe now.
I don't think we'll be able to fix this one for the next milestone. We need Vlad back to do some controlled testing to figure out where exactly the problem lies.
I also created an upstream issue for batch job support in general: https://github.com/grisu/grisu/issues/21
Vlad hasn't used batch jobs, just lots of single jobs.
please use BeSTGRID-DEV backend for this testing :)
On 15/07/11 18:20, yuriyh wrote:
Vlad hasn't used batch jobs, just lots of single jobs.
Hi guys,
I'm now back.... and I would still be doing large operations (submitting numbers of single jobs, deleting numbers of completed jobs).
If someone's keen to watch the logs, I'm happy to coordinate when I run the operations - get me on Skype or Jabber.
Cheers, Vlad
Vladimir Mencl, Ph.D. E-Research Services and Systems Consultant BlueFern Computing Services University of Canterbury Private Bag 4800 Christchurch 8140 New Zealand
http://www.bluefern.canterbury.ac.nz mailto:vladimir.mencl@canterbury.ac.nz Phone: +64 3 364 3012 Mobile: +64 21 997 352 Fax: +64 3 364 3002
If someone's keen to watch the logs, I'm happy to coordinate when I run the operations - get me on Skype or Jabber.
please do use BeSTGRID-DEV as backend for this.
And yes, let me know when you want to submit job batches, so I can watch logs.
Cheers, Yuriy
Running one batch now ..... lots of submits, not expecting errors there
Submitted about 40 jobs (out of 1600) and stopped with:
Exception in thread "main" com.sun.xml.ws.server.UnsupportedMediaException: Unsupported Content-Type: text/html; charset=iso-8859-1 Supported ones are: [text/xml]
    at com.sun.xml.ws.encoding.StreamSOAPCodec.decode(StreamSOAPCodec.java:295)
    at com.sun.xml.ws.encoding.StreamSOAPCodec.decode(StreamSOAPCodec.java:129)
    at com.sun.xml.ws.encoding.SOAPBindingCodec.decode(SOAPBindingCodec.java:360)
    at com.sun.xml.ws.transport.http.client.HttpTransportPipe.process(HttpTransportPipe.java:187)
    at com.sun.xml.ws.transport.http.client.HttpTransportPipe.processRequest(HttpTransportPipe.java:94)
    at com.sun.xml.ws.transport.DeferredTransportPipe.processRequest(DeferredTransportPipe.java:89)
    at com.sun.xml.ws.api.pipe.Fiber.__doRun(Fiber.java:598)
    at com.sun.xml.ws.api.pipe.Fiber._doRun(Fiber.java:557)
    at com.sun.xml.ws.api.pipe.Fiber.doRun(Fiber.java:542)
    at com.sun.xml.ws.api.pipe.Fiber.runSync(Fiber.java:439)
    at com.sun.xml.ws.client.Stub.process(Stub.java:222)
    at com.sun.xml.ws.client.sei.SEIStub.doProcess(SEIStub.java:135)
    at com.sun.xml.ws.client.sei.SyncMethodHandler.invoke(SyncMethodHandler.java:109)
    at com.sun.xml.ws.client.sei.SyncMethodHandler.invoke(SyncMethodHandler.java:89)
    at com.sun.xml.ws.client.sei.SEIStub.invoke(SEIStub.java:118)
    at $Proxy41.submitJob(Unknown Source)
    at grisu.frontend.model.job.JobObject.submitJob(JobObject.java:1342)
    at grisu.frontend.model.job.JobObject.submitJob(JobObject.java:1297)
    at grisu.gricli.command.SubmitCommand.submit(SubmitCommand.java:49)
    at grisu.gricli.command.SubmitCommand.execute(SubmitCommand.java:73)
    at grisu.gricli.Gricli.runCommand(Gricli.java:142)
    at grisu.gricli.Gricli.run(Gricli.java:133)
    at grisu.gricli.Gricli.executionLoop(Gricli.java:93)
    at grisu.gricli.Gricli.main(Gricli.java:86)
(on BeSTGRID-DEV)
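The trace above shows the client receiving a text/html response where it expected text/xml, and that single failure ended the whole run of submissions. While the cause is being investigated, one possible stopgap would be to retry a failed call a few times before giving up. The following is only a minimal sketch of that idea: RetrySketch and withRetry are hypothetical names, not part of Grisu or Gricli, and the demo simulates the failure with a plain RuntimeException rather than calling the real submitJob. It also assumes that repeating the call is safe, which may not hold if a failed submit leaves a half-created job on the backend.

```java
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Grisu/Gricli: retries a call with a growing pause.
class RetrySketch {

    // Runs the call; on any exception, waits and tries again, up to maxAttempts in total.
    // The pause doubles after each failed attempt. Rethrows the last exception if all fail.
    static <T> T withRetry(Callable<T> call, int maxAttempts, long initialDelayMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                last = e;
                System.err.println("attempt " + attempt + " failed: " + e);
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);
                    delay *= 2;
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Simulated submit: fails twice, then succeeds (stand-in for a real job submission).
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new RuntimeException("simulated non-XML response");
            }
            return "submitted";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Logging every failed attempt, as the sketch does on stderr, would also give a timestamped record to line up against the backend logs.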
@smas036 would like to see a test plan for this issue, so that we can run some incremental tests to understand causes of this issue.
Sure I'll get onto that : )
Gridftp testing is a lot of fun, good luck with that Sina :-)
So many possible issues and so many layers that break every now and then. And of course you can hardly ever reproduce an error. I once even had to write a dedicated gridftp-test client: https://github.com/grisu/gridftp-tests
There's not much documentation on that, since I only used it myself, but if you need a tool to upload, download, or cross-stage files via gridftp, either in parallel or lots of them in series, this might actually help a bit.
Probably needs a bit of maintenance first, though...
Closing this here, it's not really a Gricli issue. Progress in this (gridftp-errors) area can be tracked here:
https://github.com/grisu/grisu/issues/29
Turn on notifications on that if you are interested. Any suggestions on how to design a proper gridftp-test-framework are welcome...
I've just tried deleting 300 jobs with a single
clean job Batch-*
command.
The command has left 10 directories behind still sitting on the server, with some of them being empty and some still having some files left. No errors were reported.