Automatically exported from code.google.com/p/nmrrestrntsgrid

Split the system into "public" and "processing", move processing to new server #156

GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago

Docr Java and python processes currently use close to 200% CPU on tang
(2xCPU). This contributes to a whole bunch of problems, from slow response
time on public servlet to scp failures. So,

1) processing should not be done on public server. I'm pretty sure our
current webservers can handle another mysql DB and tomcat with one servlet.

2) there's a new 8-core (2x4) machine w/ 1TB mirrored raid to be used as
restraints processing computer. The only question now is whether to keep it
on Fedora 9 or install CentOS 5.2 on it, otherwise it's ready to go.

Original issue reported on code.google.com by dmitri.m...@gmail.com on 16 Jan 2009 at 8:17

GoogleCodeExporter commented 9 years ago
Can you comment on the scp failures? Do you mean issue 143?

I monitor the response times and they're acceptable, in that they are comparable to when the server doesn't have the extra load. During extra load, such as generating the images and doing the backup, the load is still acceptable, but I keep an eye on it.

I've adjusted the priority to medium as this will have to wait for some time. You can already send me the info on the move as you have it planned. Perhaps it won't be as big a deal as I am afraid of ;-)

Original comment by jurge...@gmail.com on 20 Jan 2009 at 11:23

GoogleCodeExporter commented 9 years ago
It's time-consuming to split server and processing machine. It's also unnecessary as the load is so limited: fewer than 100 requests per day according to Google Analytics. I just added you to the user list there, take a look.

That said, I'm ready to start migration as the 8 cores are just too tempting;-)

Who & what else is going to be on the machine besides this project?

I presume that the nmrrestrnts.bmrb.wisc.edu name will be transferred to it in due time?

Original comment by jurge...@gmail.com on 20 Jan 2009 at 1:04

GoogleCodeExporter commented 9 years ago
Nothing else's supposed to be there. 

Splitting is not about the load, it's about security and testing. So restraints stay internal until we copy them to the public server, and if something breaks we don't publish the broken stuff.

I'm not sure how it's supposed to work once we start processing deposited restraint files (as opposed to mr files sent to us by PDB), but if restraints get processed and published before the main entry is released, I expect that might be a bad thing.

It's also not ideal that we have a public server (tang) on our intranet. We now have two webservers in a failover configuration, and they live in a completely separate subnet, with no user accounts, no access to nfs shares, and so on. I'm sure they can handle tomcat and < 100 requests/day -- and you get high availability, software updates with no downtime and so on.

If we can mirror the database/servlet to the public servers, the nmrrestraints hostname will point to www.

Original comment by dmitri.m...@gmail.com on 20 Jan 2009 at 10:36

GoogleCodeExporter commented 9 years ago
Re: scp failures: scp encrypts everything, and that takes CPU cycles. With load hitting 100% on both CPUs, the cycles aren't available, which may or may not be contributing to the #143 problem.

Original comment by dmitri.m...@gmail.com on 20 Jan 2009 at 11:51

GoogleCodeExporter commented 9 years ago
Email me and Wim the details about root access to this machine and I'll get started on moving things over.

The reprocessing of files is going pretty slowly, which is a damn good reason to start on this. I'd love to see the 8 cores hum ;-)

Where would you suggest putting the db? It gets hit by both the processing and the servlet server. It's only a couple of MB of data in the whole db though. The actual data resides on the file system and I am thinking of putting this on the processing side. Then, can the server mount the partition(s) from the processing side?

Write up your ideas please.

Original comment by jurge...@gmail.com on 22 Jan 2009 at 6:58

GoogleCodeExporter commented 9 years ago
No, I'd prefer a complete mirror of db and data, with no cross-mounts between public and private machines. We can mirror it with rsync the same way we mirror website databases: csv dumps on the private side, rsync daemon with a post-xfer script that loads the db from the csvs on the public side.
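
A minimal sketch of the dump-and-reload idea above, not the actual mirroring scripts. To stay runnable anywhere it uses Python's built-in sqlite3 as a stand-in for mysql, and the function names are made up for illustration. The rsync transfer between the two steps is left out; the load function is roughly what a post-xfer script would call once the csv files have arrived.

```python
import csv
import sqlite3
from pathlib import Path


def dump_tables_to_csv(conn, tables, out_dir):
    """Private side: write one CSV file per table, header row first."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for table in tables:
        cur = conn.execute(f'SELECT * FROM "{table}"')
        header = [col[0] for col in cur.description]
        with open(out_dir / f"{table}.csv", "w", newline="") as fh:
            writer = csv.writer(fh)
            writer.writerow(header)
            writer.writerows(cur)  # NULLs are written as empty strings


def load_tables_from_csv(conn, tables, in_dir):
    """Public side: replace each table's rows with the CSV contents.

    Assumes the tables already exist with the same columns as the dump.
    """
    in_dir = Path(in_dir)
    with conn:  # single transaction: a failed load rolls back to the old rows
        for table in tables:
            with open(in_dir / f"{table}.csv", newline="") as fh:
                reader = csv.reader(fh)
                header = next(reader)
                cols = ", ".join(f'"{c}"' for c in header)
                marks = ", ".join("?" for _ in header)
                conn.execute(f'DELETE FROM "{table}"')
                conn.executemany(
                    f'INSERT INTO "{table}" ({cols}) VALUES ({marks})', reader
                )
```

Doing the delete and the inserts in one transaction means the public copy keeps serving the previous data if a dump turns out to be broken, which fits the "don't publish the broken stuff" goal. Empty strings and NULLs are not distinguished in this sketch.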

As for root access, ssh to lionfish, from there ssh to grunt, then run "sudo su -" (works for both jurgen and wim accounts).

Original comment by dmitri.m...@gmail.com on 22 Jan 2009 at 10:30

GoogleCodeExporter commented 9 years ago
Cool, I'm starting to mirror tang's setup there. I think it's fine to stay on FC9. Might not be a bad idea to be a major version behind.

Can you set up the mirroring of the to-be-installed directories, db, and tomcat servlet engine?

For the servlet engine you'll have to give me configuration details so that I can adjust them in the Wattos.war.

Taking Eldon off the list as this gets too detailed.

Original comment by jurge...@gmail.com on 23 Jan 2009 at 9:07

GoogleCodeExporter commented 9 years ago
(I forgot how much I hate tomcat) Here's the deal: tomcat 5 does not work on fedora 9. Our choices are:
1. stay with fedora and use tomcat 6. I have no idea if it'll break anything in your servlet.
2. switch to CentOS 5.2: tomcat 5 works on it.

I installed tc6 on grunt, configs are in /etc/tomcat6, webapps in /raid/www/webapps. Drop your servlet in there and see if it runs... Mysql data dir is /raid/mysql.

I can set up the mirroring of data files and db, once you tell me the details: what files, where, db schema (I'm pretty sure I can dump it from mysql somehow) etc.

Original comment by dmitri.m...@gmail.com on 24 Jan 2009 at 1:22

GoogleCodeExporter commented 9 years ago
Also, grunt:/raid is mounted on tang as /grunt.
And tomcat-apache redirection seems to work: http://grunt/tomcat takes you to the same page as http://grunt:8080/

I'll probably need to open up the firewalls so you can see it from there -- or you can use ssh's dynamic port forwarding. Let me know.
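
For reference, a small sketch of what the dynamic port forwarding option could look like. It only builds the ssh command line; the helper name, the local port 1080, and the choice of lionfish as the gateway are assumptions for illustration, based on lionfish being the hop used to reach grunt.

```python
import shlex


def socks_tunnel_command(gateway, local_port=1080, user=None):
    """Build an ssh command that opens a local SOCKS proxy through `gateway`.

    -D PORT : dynamic (SOCKS) port forwarding on localhost:PORT
    -N      : do not run a remote command, just keep the tunnel open
    """
    target = f"{user}@{gateway}" if user else gateway
    return ["ssh", "-N", "-D", str(local_port), target]


if __name__ == "__main__":
    # Hypothetical usage: tunnel through lionfish as user jurgen.
    print(shlex.join(socks_tunnel_command("lionfish", 1080, "jurgen")))
```

With the tunnel up, a browser set to use localhost:1080 as a SOCKS proxy (with DNS resolved through the proxy) should be able to open http://grunt:8080/ as if it were on the internal network, assuming the gateway itself can reach grunt.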

Original comment by dmitri.m...@gmail.com on 24 Jan 2009 at 1:35

GoogleCodeExporter commented 9 years ago
So, considering issue 165, I'll leave the initiative for this issue with you.

Original comment by jurge...@gmail.com on 26 Jan 2009 at 2:10

GoogleCodeExporter commented 9 years ago
Ok, let's keep the issue here now.

I have no reason to believe my code would fail on the newer tomcat but I haven't tried it out yet. Since the code is very small it would be easy to update for any backwards compatibility issues you might find.

Original comment by jurge...@gmail.com on 26 Jan 2009 at 8:15

GoogleCodeExporter commented 9 years ago
Famous last words. I'm more worried about openjdk, though -- I see some weird stuff with the validator UI. Anyway, if there are any compatibility issues you'll have to update the code sooner or later so it might as well be now.

Original comment by dmitri.m...@gmail.com on 26 Jan 2009 at 8:34

GoogleCodeExporter commented 9 years ago
OK, so what do I need to install to test the servlet and Java code?

Chris knows what's involved in processing, but that's not what I need. Is there a list of modules/projects with an explanation of what they do and where to get them? What are cinq, nmrrestraintsgrid, and ccpn, and how are they different from wattos?

If you want me to do this, you'll have to tell me what "this" is. Right now I have no idea where to start.

Original comment by dmitri.m...@gmail.com on 27 Jan 2009 at 6:36

GoogleCodeExporter commented 9 years ago
On second thought, apparently we don't care about any of this right now. So I'm not going to be doing any of this anytime soon.

Original comment by dmitri.m...@gmail.com on 27 Jan 2009 at 8:39

GoogleCodeExporter commented 9 years ago
No problem.

Original comment by jurge...@gmail.com on 28 Jan 2009 at 10:04