nextflow-io / nextflow

A DSL for data-driven computational pipelines
http://nextflow.io
Apache License 2.0

Name or service not known error when running on EC2 #409

Closed mhalagan closed 7 years ago

mhalagan commented 7 years ago

Running this command on the master node

./nextflow run nmdp-bioinformatics/flow-Optitype \
    --with-docker nmdpbioinformatics/flow-OptiType \
    --outfile hli-optitype.csv \
    --bamdir s3://bucket/s3/data \
    --datatype dna

Returns the following errors:

N E X T F L O W  ~  version 0.25.3-SNAPSHOT
Pulling nmdp-bioinformatics/flow-Optitype ...
 downloaded from https://github.com/nmdp-bioinformatics/flow-OptiType.git
Launching `nmdp-bioinformatics/flow-Optitype` [sleepy_jang] - revision: 6fcb330fe1 [master]

---------------------------------------------------------------
NEXTFLOW OPTITYPE
---------------------------------------------------------------
Input BAM folder   (--bamdir)          : s3://bucket/s3/data
Sequence data type (--datatype)        : dna
Output file name   (--outfile)         : hli-optitype.csv

[warm up] executor > ignite
ERROR ~ ip-xx-xxx-xx-xxx: ip-xx-xxx-xx-xxx: Name or service not known

 -- Check script 'main.nf' at line: 68 or see '.nextflow.log' file for more details

mhalagan commented 7 years ago

nextflow.txt

pditommaso commented 7 years ago

It seems it's not able to resolve the host name:

Caused by: java.net.UnknownHostException: ip-10-223-13-100: Name or service not known
    at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
    at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:922)
    at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1316)
    at java.net.InetAddress.getLocalHost(InetAddress.java:1492)

What distribution (AMI) are you using?

mhalagan commented 7 years ago

That's the private IP, so I'm guessing it needs to be accessing the public IP address instead? I terminated the instance.

mhalagan commented 7 years ago

I'm running the cluster from within a VPC and I provided the nextflow.config with a subnet and security group ID.

pditommaso commented 7 years ago

Yes, it should be able to resolve the IP. I'm doing a test.

mhalagan commented 7 years ago

I'm using an Ubuntu AMI that I created.

pditommaso commented 7 years ago

I think this is more of an AWS network configuration problem. Have a look here.

mhalagan commented 7 years ago

That makes sense. I'll try that and let you know if it works. Thanks!

mhalagan commented 7 years ago

It looks like it's failing because it's on an Ubuntu AMI. On the Ubuntu AMI you can't ping the hostname without adding .ec2.internal at the end. It would probably work better if it used the IP address instead of the DNS name.

pditommaso commented 7 years ago

That looks weird; I've tested this with Ubuntu in the past. Also, what is failing is a plain Java API call, i.e.:

InetAddress.getLocalHost().getHostAddress()

It would be very strange if that could not run on Ubuntu.
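
For anyone who wants to reproduce the lookup outside Nextflow, here is a minimal sketch that makes the same standard-library call and reports whether it succeeds. The class name `LocalHostCheck` and the method `describeLocalHost` are made up for illustration and are not part of Nextflow.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of Nextflow: isolates the call from the stack trace.
final class LocalHostCheck {

    // Returns a one-line description of what InetAddress.getLocalHost() does on this machine.
    static String describeLocalHost() {
        try {
            // getLocalHost() reads the system host name and then resolves it to an address.
            InetAddress local = InetAddress.getLocalHost();
            return "resolved: " + local.getHostName() + " -> " + local.getHostAddress();
        } catch (UnknownHostException e) {
            // This is the exception reported in the log above.
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describeLocalHost());
    }
}
```

If this prints a line starting with `failed:` on the instance, the problem is in the host's name resolution rather than in Nextflow itself.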

mhalagan commented 7 years ago

Actually, never mind. I don't think it's an issue with using Ubuntu; it was an issue with the VPC I was using. The AWS default VPC resolves an instance's DNS name to ip-address.ec2.internal, but I was using one that was modified to return a different suffix. I think it'd be a good idea to use the IP address instead of the DNS name, because that's more reliable.

pditommaso commented 7 years ago

The API is fetching the IP address, but for some reason it tries to resolve the host name. Does this mean you have solved the issue?

pditommaso commented 7 years ago

I assume this is solved. Feel free to comment or re-open if necessary.

mhalagan commented 7 years ago

Yeah, this has been solved. Sorry for the delayed response.

pditommaso commented 7 years ago

Could you please provide a short description of the solution that could be useful for other users?

mhalagan commented 7 years ago

Go to the AWS VPC page and click on "DHCP Options Sets". Make sure the one you are using has its domain name set to the default, which is "ec2.internal".
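
A rough way to check whether the change took effect is to test if the instance's short host name resolves, with and without the domain suffix. The sketch below is only an illustration: the class `HostResolveCheck` is hypothetical, the default host name is the one from the stack trace earlier in this thread, and the suffix is assumed to be `ec2.internal`. Pass your own values as arguments.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical diagnostic, not part of Nextflow.
final class HostResolveCheck {

    // True if the JVM can turn the given name (or literal IP) into an address.
    static boolean canResolve(String host) {
        try {
            InetAddress.getByName(host);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults: replace with the output of `hostname` and your VPC's domain name.
        String shortName = args.length > 0 ? args[0] : "ip-10-223-13-100";
        String suffix = args.length > 1 ? args[1] : "ec2.internal";

        for (String name : new String[] { shortName, shortName + "." + suffix }) {
            System.out.println(name + " resolves: " + canResolve(name));
        }
    }
}
```

If only the fully qualified name resolves, the DHCP options set attached to the VPC is the first thing to check.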