jwallen closed 11 years ago
It appears that fame and frankie are working, in that they can be run standalone from the command-line to give the correct output. This means that it is just GATPFit that is at issue. Looking more at GATPFit, I find that it fails at the first read() statement with an end-of-file condition. It is not immediately apparent to me what is different about how GATPFit reads data from stdin, but I will keep looking.
Just to be sure, I tried with both Windows and Unix line endings, with no effect.
Oddly enough, it seems like my antivirus/firewall was the problem all along. As soon as I turned it off, everything ran fine. However, things are still not working for Shamel. We're still looking into this.
I've written a minimal working example of a Java program that communicates with a persistent Fortran process, based on the GATPFit and frankie implementations. I think I have some grasp of how it works now, although I'm still not sure we're using the best set of Reader and Writer classes in the Java portion. My toy example also does not run for Shamel, so at least we seem to have isolated the problem. See https://gist.github.com/4035000 for the code.
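For readers who don't want to open the gist, the general pattern looks roughly like the sketch below. This is not the gist's code or RMG's actual implementation: the class name `PersistentChild` is hypothetical, and the Unix `cat` utility stands in for the Fortran executable simply because it echoes each stdin line back on stdout. The details that seem to matter are flushing after every write and keeping the child's stdin open between requests, since closing it gives the child an end-of-file on its next read.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical wrapper around one long-lived child process (not RMG's actual class).
class PersistentChild implements AutoCloseable {
    private final Process process;
    private final BufferedWriter toChild;
    private final BufferedReader fromChild;

    PersistentChild(String... command) {
        process = start(command);
        toChild = new BufferedWriter(
                new OutputStreamWriter(process.getOutputStream(), StandardCharsets.UTF_8));
        fromChild = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8));
    }

    private static Process start(String... command) {
        try {
            ProcessBuilder builder = new ProcessBuilder(command);
            // Let the child's stderr go to our stderr so it cannot fill up and block.
            builder.redirectError(ProcessBuilder.Redirect.INHERIT);
            return builder.start();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Sends one line to the child's stdin and waits for one line on its stdout.
    String request(String line) {
        try {
            toChild.write(line);
            toChild.write('\n');
            toChild.flush(); // without this the child may never see the request
            String reply = fromChild.readLine();
            if (reply == null) {
                throw new IllegalStateException("child closed its output");
            }
            return reply;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Closing the child's stdin signals end-of-file, after which we wait for it to exit.
    @Override
    public void close() {
        try {
            toChild.close();
            process.waitFor();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // "cat" is only a stand-in for the real executable.
        try (PersistentChild child = new PersistentChild("cat")) {
            System.out.println(child.request("first request"));
            System.out.println(child.request("second request"));
        }
    }
}
```

The same child answers both requests, so the process is started only once. A real Fortran child would also need to flush its own stdout after each reply, otherwise `readLine()` on the Java side can block indefinitely.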
Also, have we modified fame or DASSL to use avoidfork yet? It seems like we need to convert those in order to actually see the benefits of avoidfork from a memory doubling standpoint. However, that seems like a pretty significant change to make this close to a release. Was there a reason they weren't done before?
The original motivation (from @ramanan) was to avoid the CPU time overhead of forking, rather than to solve the memory doubling issue, so he profiled and did the slowest one first.
To avoid the memory doubling, yes, we need to avoid ALL forking. However, I set about implementing them one at a time and never got around to doing them all. I thought DASSL looked like a particularly tough nut to crack, so I moved on. (One way to do it would be to have a separate thread running a "DASSL server" in Java that does fork to spawn DASSL jobs, but that has only a small memory footprint, so it doesn't double much memory.) But certainly not this close to a release!
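To make the "server" idea concrete, here is a minimal sketch, assuming the launcher runs as its own small JVM and receives one command per line. Nothing like this exists in RMG as far as this thread shows: the class name `JobLauncher` and the line-based protocol are hypothetical, and `Redirect.DISCARD` requires Java 9 or later.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;

// Hypothetical launcher meant to run in its own small JVM (not part of RMG).
class JobLauncher {
    // Reads one command per line, runs it to completion, and replies with its exit code.
    static void serve(BufferedReader requests, PrintWriter replies) {
        try {
            String line;
            while ((line = requests.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // ignore blank request lines
                }
                ProcessBuilder builder = new ProcessBuilder(line.trim().split("\\s+"));
                // Keep the job's stdout off the reply channel (Java 9+).
                builder.redirectOutput(ProcessBuilder.Redirect.DISCARD);
                builder.redirectError(ProcessBuilder.Redirect.INHERIT);
                Process job = builder.start();
                job.getOutputStream().close(); // the job sees end-of-file on its stdin
                replies.print(job.waitFor() + "\n");
                replies.flush();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a job", e);
        }
    }

    public static void main(String[] args) {
        serve(new BufferedReader(new InputStreamReader(System.in)),
              new PrintWriter(System.out));
    }
}
```

The large RMG process would start this launcher once, early on, and then talk to it over a pipe. Every later fork would then happen from the small launcher rather than from the large JVM.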
We have now fixed things for Shamel, and it works for Connie on Windows as well. We're not entirely sure what fixed it for Shamel; today we installed ant and the most recent JDK (these aren't needed if you are using Eclipse, which Shamel is). Now it works even in Eclipse.
Is there any advice to add to developer installation/compilation instructions? "Please try installing Ant and the most recent JDK"? And/or "Please turn off firewalls" for user instructions?
It looks like the advice is to (1) use the official JDK (preferably the most recent version*) to build RMG, and not rely on whatever is bundled with Eclipse, and (2) make sure all of the executables are whitelisted by your firewall so that they run without interference. I will add these to the installation instructions.
*Due to significant vulnerabilities in Java itself, not necessarily due to any issues with RMG.
When I try to run the minimal example on Windows (XP, 7), I get the following error when it attempts to run GATPFit for the first species (C2H6):
Running GATPFit directly from the command line and passing the saved GATPFit/INPUT.txt on stdin gives no errors, but no output either.
As far as I know, no one has tried to run RMG on Windows since the avoidfork branch was merged in https://github.com/GreenGroup/RMG-Java/pull/257. This will need to be fixed before RMG 4.0, since many, if not most, of our end users (and some of our developers!) are Windows users.