grailbio / reflow

A language and runtime for distributed, incremental data processing in the cloud
Apache License 2.0

remote error: tls: bad certificate - for reasonable requests #92

Closed: olgabot closed this issue 5 years ago

olgabot commented 5 years ago

Hello, I'm trying to run a relatively simple bedtools.rf workflow (in "Details" below) with @AMaynard10 and can't seem to get ANY of the jobs through.

```
bedtools := "quay.io/biocontainers/bedtools:2.27.0--he941832_2"
files := make("$/files")

// / # bedtools coverage
// Tool: bedtools coverage (aka coverageBed)
// Version: v2.27.0
// Summary: Returns the depth and breadth of coverage of features from B
//          on the intervals in A.
// Usage: bedtools coverage [OPTIONS] -a -b
// Options:
//   -hist Report a histogram of coverage for each feature in A
//         as well as a summary histogram for _all_ features in A.
//         Output (tab delimited) after each feature in A:
//           1) depth
//           2) # bases at depth
//           3) size of A
//           4) % of A at depth
//   -d Report the depth at each position in each A feature.
//      Positions reported are one based. Each position
//      and depth follow the complete A feature.
//   -counts Only report the count of overlaps, don't compute fraction, etc.
//   -mean Report the mean depth of all positions in each A feature.
//   -s Require same strandedness. That is, only report hits in B
//      that overlap A on the _same_ strand.
//      - By default, overlaps are reported without respect to strand.
//   -S Require different strandedness. That is, only report hits in B
//      that overlap A on the _opposite_ strand.
//      - By default, overlaps are reported without respect to strand.
//   -f Minimum overlap required as a fraction of A.
//      - Default is 1E-9 (i.e., 1bp).
//      - FLOAT (e.g. 0.50)
//   -F Minimum overlap required as a fraction of B.
//      - Default is 1E-9 (i.e., 1bp).
//      - FLOAT (e.g. 0.50)
//   -r Require that the fraction overlap be reciprocal for A AND B.
//      - In other words, if -f is 0.90 and -r is used, this requires
//        that B overlap 90% of A and A _also_ overlaps 90% of B.
//   -e Require that the minimum fraction be satisfied for A OR B.
//      - In other words, if -e is used with -f 0.90 and -F 0.10 this requires
//        that either 90% of A is covered OR 10% of B is covered.
//        Without -e, both fractions would have to be satisfied.
//   -split Treat "split" BAM or BED12 entries as distinct BED intervals.
//   -g Provide a genome file to enforce consistent chromosome sort order
//      across input files. Only applies when used with -sorted option.
//   -nonamecheck For sorted data, don't throw an error if the file has different naming conventions
//      for the same chromosome. ex. "chr1" vs "chr01".
//   -sorted Use the "chromsweep" algorithm for sorted (-k1,1 -k2,2n) input.
//   -bed If using BAM input, write output as BED.
//   -header Print the header from the A file prior to results.
//   -nobuf Disable buffered output. Using this option will cause each line
//      of output to be printed as it is generated, rather than saved
//      in a buffer. This will make printing large output files
//      noticeably slower, but can be useful in conjunction with
//      other software tools and scripts that need to process one
//      line of bedtools output at a time.
//   -iobuf Specify amount of memory to use for input buffer.
//      Takes an integer argument. Optional suffixes K/M/G supported.
//      Note: currently has no effect with compressed files.
// Default Output:
//   After each entry in A, reports:
//     1) The number of features in B that overlapped the A interval.
//     2) The number of bases in A that had non-zero coverage.
//     3) The length of the entry in A.
//     4) The fraction of bases in A that had non-zero coverage.
func Coverage(reference, cell_bam file) =
    exec (image := bedtools, mem := 2*GiB) (output file) {"
        bedtools coverage -a {{reference}} -b {{cell_bam}} > {{output}}
    "}

val Main = {
    cell_bam := file("s3://darmanis-group/singlecell_lungadeno/non_immune/nonImmune_bams_9.27/170125/G12_1001000292/G12_1001000292_S72.homo.Aligned.out.sorted.bam")
    reference := file("s3://darmanis-group/singlecell_lungadeno/non_immune/non_immune_bedtools/refs/BRAF_KRAS_EGFR_sorted.bed")
    output := "s3://darmanis-group/singlecell_lungadeno/non_immune/non_immune_bedtools/outputs/G12_1001000292.coverage.txt"
    coverage := Coverage(reference, cell_bam)
    files.Copy(coverage, output)
}
```

Here's the output from reflow run, which keeps failing with a network error and "bad certificate":

 Wed 28 Nov - 15:19  ~/code/reflow-workflows   origin ☊ olgabot/bedtools 1⚙ 1● 
  reflow run bedtools.rf     
reflow: run ID: 3b527e93
reflow: ec2cluster: error while waiting for offers: offers ec2-18-237-51-122.us-west-2.compute.amazonaws.com:9000: network error: Get https://ec2-18-237-51-122.us-west-2.compute.amazonaws.com:9000/v1/offers%2F: remote error: tls: bad certificate
reflow: ec2cluster: error while waiting for offers: offers ec2-35-167-37-93.us-west-2.compute.amazonaws.com:9000: network error: Get https://ec2-35-167-37-93.us-west-2.compute.amazonaws.com:9000/v1/offers%2F: remote error: tls: bad certificate
reflow: ec2cluster: error while waiting for offers: offers ec2-35-167-37-93.us-west-2.compute.amazonaws.com:9000: network error: Get https://ec2-35-167-37-93.us-west-2.compute.amazonaws.com:9000/v1/offers%2F: remote error: tls: bad certificate
reflow: ec2cluster: error while waiting for offers: offers ec2-52-40-64-53.us-west-2.compute.amazonaws.com:9000: network error: Get https://ec2-52-40-64-53.us-west-2.compute.amazonaws.com:9000/v1/offers%2F: remote error: tls: bad certificate
reflow: ec2cluster: error while waiting for offers: offers ec2-52-40-64-53.us-west-2.compute.amazonaws.com:9000: network error: Get https://ec2-52-40-64-53.us-west-2.compute.amazonaws.com:9000/v1/offers%2F: remote error: tls: bad certificate
reflow: ec2cluster: error while waiting for offers: offers ec2-54-189-82-93.us-west-2.compute.amazonaws.com:9000: network error: Get https://ec2-54-189-82-93.us-west-2.compute.amazonaws.com:9000/v1/offers%2F: remote error: tls: bad certificate
reflow: ec2cluster: error while waiting for offers: offers ec2-54-189-82-93.us-west-2.compute.amazonaws.com:9000: network error: Get https://ec2-54-189-82-93.us-west-2.compute.amazonaws.com:9000/v1/offers%2F: remote error: tls: bad certificate
reflow: ec2cluster: error while waiting for offers: offers ec2-34-216-195-169.us-west-2.compute.amazonaws.com:9000: network error: Get https://ec2-34-216-195-169.us-west-2.compute.amazonaws.com:9000/v1/offers%2F: remote error: tls: bad certificate
reflow: ec2cluster: error while waiting for offers: offers ec2-34-216-195-169.us-west-2.compute.amazonaws.com:9000: network error: Get https://ec2-34-216-195-169.us-west-2.compute.amazonaws.com:9000/v1/offers%2F: remote error: tls: bad certificate
reflow: ec2cluster: error while waiting for offers: offers ec2-34-211-39-196.us-west-2.compute.amazonaws.com:9000: network error: Get https://ec2-34-211-39-196.us-west-2.compute.amazonaws.com:9000/v1/offers%2F: remote error: tls: bad certificate
reflow: ec2cluster: error while waiting for offers: offers ec2-34-211-39-196.us-west-2.compute.amazonaws.com:9000: network error: Get https://ec2-34-211-39-196.us-west-2.compute.amazonaws.com:9000/v1/offers%2F: remote error: tls: bad certificate
reflow: ec2cluster: error while waiting for offers: offers ec2-54-187-47-140.us-west-2.compute.amazonaws.com:9000: network error: Get https://ec2-54-187-47-140.us-west-2.compute.amazonaws.com:9000/v1/offers%2F: remote error: tls: bad certificate
reflow: ec2cluster: error while waiting for offers: offers ec2-54-187-47-140.us-west-2.compute.amazonaws.com:9000: network error: Get https://ec2-54-187-47-140.us-west-2.compute.amazonaws.com:9000/v1/offers%2F: remote error: tls: bad certificate
reflow: ec2cluster: error while waiting for offers: offers ec2-34-219-99-230.us-west-2.compute.amazonaws.com:9000: network error: Get https://ec2-34-219-99-230.us-west-2.compute.amazonaws.com:9000/v1/offers%2F: remote error: tls: bad certificate
reflow: ec2cluster: error while waiting for offers: offers ec2-34-219-99-230.us-west-2.compute.amazonaws.com:9000: network error: Get https://ec2-34-219-99-230.us-west-2.compute.amazonaws.com:9000/v1/offers%2F: remote error: tls: bad certificate
reflow: ec2cluster: error while waiting for offers: offers ec2-52-35-166-218.us-west-2.compute.amazonaws.com:9000: network error: Get https://ec2-52-35-166-218.us-west-2.compute.amazonaws.com:9000/v1/offers%2F: remote error: tls: bad certificate
reflow: ec2cluster: error while waiting for offers: offers ec2-52-35-166-218.us-west-2.compute.amazonaws.com:9000: network error: Get https://ec2-52-35-166-218.us-west-2.compute.amazonaws.com:9000/v1/offers%2F: remote error: tls: bad certificate
reflow: ec2cluster: error while waiting for offers: offers ec2-18-237-162-121.us-west-2.compute.amazonaws.com:9000: network error: Get https://ec2-18-237-162-121.us-west-2.compute.amazonaws.com:9000/v1/offers%2F: remote error: tls: bad certificate
reflow: ec2cluster: error while waiting for offers: offers ec2-18-237-162-121.us-west-2.compute.amazonaws.com:9000: network error: Get https://ec2-18-237-162-121.us-west-2.compute.amazonaws.com:9000/v1/offers%2F: remote error: tls: bad certificate
ec2cluster: 0 instances:  (<=$0.0/hr), total{}, waiting{mem:2.0GiB cpu:1 disk:1.0GiB}, pending{mem:3.7GiB cpu:2 disk:250.0GiB intel_avx512:2}
  allocate {mem:2.0GiB cpu:1 disk:1.0GiB}:  provisioning new instance                  49m28s
  i-05f695bddc8609e65:                      waiting for reflowlet to become available  2m4s

What's weird is that there are instances that are running and initialized from this computer:

[Screenshot taken 2018-11-29 at 9:21:44 am]

So I'm not sure why reflow can't connect to them, or why they aren't running anything.

Here's my reflow config:

assoc: dynamodb,czbiohub-reflow-quickstart
cluster: ec2cluster
ec2cluster:
  ami: ami-4296ec3a
  cloudconfig: {}
  diskslices: 0
  diskspace: 250
  disktype: gp2
  instancetypes:
  - c1.medium
  - c1.xlarge
  - c3.2xlarge
  - c3.4xlarge
  - c3.8xlarge
  - c3.large
  - c3.xlarge
  - c4.2xlarge
  - c4.4xlarge
  - c4.8xlarge
  - c4.large
  - c4.xlarge
  - c5.large
  - c5.xlarge
  - c5.2xlarge
  - c5.4xlarge
  - c5.9xlarge
  - c5.18xlarge
  - cc2.8xlarge
  - m1.large
  - m1.medium
  - m1.small
  - m1.xlarge
  - m2.2xlarge
  - m2.4xlarge
  - m2.xlarge
  - m3.2xlarge
  - m3.large
  - m3.medium
  - m3.xlarge
  - m4.16xlarge
  - m4.4xlarge
  - m4.xlarge
  - m4.16xlarge
  - m5.24xlarge
  - r4.xlarge
  - r4.16xlarge
  - t1.micro
  - t2.large
  - t2.medium
  - t2.micro
  - t2.nano
  - t2.small
  keyname: ""
  maxinstances: 20
  region: us-west-2
  securitygroup: sg-661d7f19
  spot: true
  sshkey: |
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDdE/ryKjTlTome/82Bx68aVaeEo3p+KkaKtIWRSmbHmd2HlLcEt56+9xzYuhYtQPIqZqKLI682HcUqNW9ApnTCK6uSbqGaaAQ/d6otnnEIL3Q+mLMzyA9ABZ5iX5Ny8M0uEuHLRblH7j/YIqbXKnXenCgC++4n+R/8sLYvULeTihdlCJj7C6FL5koyF+U0gTce3qcarpvdNNxnBgPor7cfRwzaF5co82xCB44Zoz76r6uDg9b9AYQPZyYaNuJCc7csZPpSUsrKj9G7Dcnsis/2k2BTc5yvMeF+sr5IKq2uZpkWQuaheSBkIMT9KelSvjtmAritTpPgxYGoVyEp3bphn1iMdxOLceCcu5ldj0TTgVdTrEb3A6Nvl4RZ3nxWrx5oNfKoLXo5wY/vrJxBwO3eZQVSQ27FQ2Foh86sA6eYEKD4c5cFNX09myb74JJp1ATn6FHdDWXx3d40pQKAP4CjqBo9TvcsXXJxWYQQ12CxrOxmyq55Suf/RKpRVsw4dO6xJBPilKGok31s9ePPik3ASmzz2nAOV+aQIFESQ5EcC7CKICUtsdjc6hxHrrzzf20D3pfcMJ9fZuNsSEGOVUBJqe+kHADlYMuUjgbS4dQykSJ2oDP/C5PQOstyVthm3t6YmR+VUsxvzwbu8EhNGDdVP60nOZK54aSVvRyft5vvww== olga.botvinnik@gmail.com
https: httpsca,/Users/olgabot/.reflow/reflow.pem
repository: s3,czbiohub-reflow-quickstart-cache

Thank you! Warmest, Olga

mariusae commented 5 years ago

Can you try to delete /Users/olgabot/.reflow/reflow.pem?

olgabot commented 5 years ago

Yes, that worked! Thank you!

olgabot commented 5 years ago

Why did that work?

mariusae commented 5 years ago

The default expiration time for the certificate authority created by Reflow was too short. I've just submitted a fix internally for this. (It's not an issue internally because we use a different CA.) Should show up here soon.

olgabot commented 5 years ago

Ah, I see. Good to know! Thank you!