cooperative-computing-lab / makeflow-examples

Example workflows for the Makeflow workflow system.

Lifemapper Example #27

Closed by tshaffe1 5 years ago

tshaffe1 commented 6 years ago

This is a cleaned-up Lifemapper demo suitable for inclusion in the Makeflow examples. This version includes all the software pieces required as well as a shareable sample dataset. I removed the setup for nested workflows we used in the paper, as it introduces a fair amount of complexity and requires arranging $PATH or preparing a precompiled makeflow for worker nodes. I also tried to make the dependency relationships a bit clearer. I've tested locally and with WQ+Condor following the README.

tshaffe1 commented 6 years ago

I almost forgot, this depends on cooperative-computing-lab/cctools#1948

dthain commented 6 years ago

Oh, I just noticed you have java8_run as a self-extracting archive. +1 for easy self-contained reproducibility, -1 for big binaries in the archive.

Are there any legal restrictions on redistributing the JRE? Is there a not-awful way to indicate Java 8 as a dependency but not include it? Maybe some copy-pasta to download/install if needed?

tshaffe1 commented 6 years ago

I wasn't the one who obtained that Java runtime in the first place, so I'm not sure what restrictions are in place. I can rebase it out. Installing Java depends on the platform, so I'd be inclined to just link to the downloads page in case it's not already installed.
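
As an illustration of the "link to the downloads page if Java is missing" approach, here is a minimal Python sketch that checks whether a `java` executable is on the PATH and reports its major version. It is not part of this PR; the function names `parse_java_major` and `installed_java_major` are hypothetical, and the parsing assumes the usual `java -version` banner format (for example `java version "1.8.0_181"` or `openjdk version "11.0.2"`).

```python
import re
import shutil
import subprocess


def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x (e.g. "1.8.0_181").
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def installed_java_major():
    """Return the major version of the `java` on PATH, or None if absent."""
    java = shutil.which("java")
    if java is None:
        return None
    # `java -version` writes its banner to stderr, not stdout.
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    return parse_java_major(result.stderr)


if __name__ == "__main__":
    major = installed_java_major()
    if major is None:
        print("No usable java found on PATH; install a Java runtime first.")
    else:
        print(f"Found Java {major}")
```

A README could point users at a check like this before running the workflow, rather than shipping a JRE in the repository.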

dthain commented 6 years ago

Hmm, I'm just wondering if there is anything "special" about that particular java installation.

tshaffe1 commented 6 years ago

Nothing particularly special. We only included that because some of the Condor pool doesn't have Java installed. I just tested it with the Java on my workstation, no problems.

dthain commented 6 years ago

When I run this, I see lots of errors:

Exception in thread "main" java.util.NoSuchElementException
    at java.util.StringTokenizer.nextToken(StringTokenizer.java:349)
    at density.Project.projectGrid(Project.java:152)
    at density.Project.doProject(Project.java:112)
    at density.Project.main(Project.java:522)

And:

Warning: Sample at 32.2275, -109.781389 in Conanthalictus_conanthi.csv is outside the bounding box of environmental data, skipping
Warning: Sample at 32.2275, -109.781389 in Conanthalictus_conanthi.csv is outside the bounding box of environmental data, skipping
Warning: Sample at 32.2275, -109.781389 in Conanthalictus_conanthi.csv is outside the bounding box of environmental data, skipping
Warning: Sample at 32.2275, -109.781389 in Conanthalictus_conanthi.csv is outside the bounding box of environmental data, skipping
Warning: Sample at 32.2275, -109.781389 in Conanthalictus_conanthi.csv is outside the bounding box of environmental data, skipping

Is that expected behavior? If so, it should be discussed in the README.

(Is this an opportunity to use wrappers to address the problem?)

Also, the dot image isn't as interesting as the previous one. Was that generated using make_image.sh? Maybe use a larger example?

tshaffe1 commented 6 years ago

That's expected behavior. The failures are data dependent, so I don't think wrappers would help reduce the noise. I could look at using them for cleanup actions, though, as I currently just stuck extra bits on the ends of the commands.
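
Since the repeated warnings are expected, one option when reading the logs is to collapse identical lines after the fact. The following is a small Python sketch of that idea, not something this PR does; `collapse_repeats` is a hypothetical helper, and it assumes the log has already been captured as a list of lines.

```python
from collections import Counter


def collapse_repeats(lines):
    """Collapse identical log lines into one entry each, with a repeat count.

    Lines are reported in order of first appearance.
    """
    counts = Counter(line.rstrip("\n") for line in lines)
    return [f"{line} (x{n})" if n > 1 else line for line, n in counts.items()]


if __name__ == "__main__":
    import sys

    # Read a captured log on stdin and print the collapsed version.
    for line in collapse_repeats(sys.stdin):
        print(line)
```

This only changes how the log is displayed; the underlying data-dependent failures still occur.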

I can generate an image of an expanded taxon like we used in the paper. I was just using the ones that actually run in this example.

tshaffe1 commented 6 years ago

I did a synthetically expanded taxon as in the paper. I also used make_image.sh and updated the banner.

dthain commented 6 years ago

Tim, we let this one slip, let's get it fixed up.

The Lifemapper part looks good to me now.

The only thing I notice now is that the images changed and they seem a little off -- the more dense images (hecil, bwa-gatk) are barely readable. Did something change in the image generation that we missed?

tshaffe1 commented 6 years ago

Huh, I get that same scale issue on master.... I wonder if those were generated using an older version of Graphviz? My workstation has v2.30.1

dthain commented 6 years ago

I have 2.40.1 on my mac and 2.30.1 in ccl software.

Can you check if one or the other gives better results?

(It's a small thing, but very highly visible to potential users...)

tshaffe1 commented 6 years ago

I got 2.40.1 from cclimport, and same issue. Where were these generated originally?

tshaffe1 commented 6 years ago

It looks like the current graphic (for hecil at least) has far fewer nodes, so it makes sense that the newly generated ones are harder to read. Something is weird here....

tshaffe1 commented 6 years ago

So commit 9f5c771 changed hecil.mf without updating the image. The corresponding PNG hasn't been updated since a couple of months before that commit. I guess a few of the examples have gotten out of sync.
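
A check along these lines could catch workflows whose images have fallen out of sync. This is only a sketch under assumptions: the helper names and the example paths are hypothetical, and it compares the last commit time of each file using `git log -1 --format=%ct`, since file modification times are not meaningful after a fresh clone.

```python
import subprocess


def last_commit_time(path):
    """Unix timestamp of the last commit touching `path` (0 if never committed)."""
    out = subprocess.run(
        ["git", "log", "-1", "--format=%ct", "--", path],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    return int(out) if out else 0


def stale_images(pairs, changed=last_commit_time):
    """Return the (workflow, image) pairs whose workflow changed after its image."""
    return [(wf, img) for wf, img in pairs if changed(wf) > changed(img)]


if __name__ == "__main__":
    # Hypothetical paths; adjust to the actual repository layout.
    pairs = [("hecil/hecil.mf", "hecil/hecil.png")]
    for wf, img in stale_images(pairs):
        print(f"{img} is older than {wf}")
```

Passing the timestamp lookup in as the `changed` parameter keeps the comparison testable without a git repository.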

dthain commented 6 years ago

Aha, I see you are right.

Please see what you can do to make the images a little more pleasing, using the workflows as they are. Maybe it's just a matter of (not) antialiasing or downsampling or something like that in order to visualize.

tshaffe1 commented 6 years ago

I can play with the image size, but those ones just have too many nodes to view without zooming and scrolling. When scaled down for the banner, we'd have the same problem. For the moment, I rebased the other examples out of this PR.