lemurheavy / coveralls-public

The public issue tracker for coveralls.io
http://coveralls.io

Parallel build failure #1093

Closed chrisblossom closed 6 years ago

chrisblossom commented 6 years ago

I've been trying to set up parallel builds with CircleCI without success. Please advise on what I'm doing wrong. Thank you!

When using the recommended CircleCI setup, the following error is displayed: webhook returned http status 500, connecting to https://coveralls.io/webhook?repo_token=COVERALLS_TOKEN_IS_HERE

When using the curl method, the following happens:

#!/bin/bash -eo pipefail
curl -k https://coveralls.io/webhook?repo_token=$COVERALLS_REPO_TOKEN -d "payload[build_num]=$CIRCLE_BUILD_NUM&payload[status]=done"
<!DOCTYPE html>
<html lang='en'>

  <head>
    <meta charset='utf-8'>
    <meta content='IE=Edge,chrome=1' http-equiv='X-UA-Compatible'>
    <title>Coveralls :: Oops, something went wrong (500)</title>

    <link type="text/css" rel="stylesheet" media="screen" href="/pub.css">
    <link href="//netdna.bootstrapcdn.com/font-awesome/3.2.1/css/font-awesome.min.css" rel="stylesheet">
    <script src="http://www.google.com/jsapi" type="text/javascript"></script>
    <script type="text/javascript" src="//use.typekit.net/ngz2htl.js"></script>
    <script type="text/javascript">try{Typekit.load();}catch(e){}</script>

  </head>

  <body id='errorPage' class='errorAlert'>
    <div class='container'>
      <div class='container_footer'>

        <div class="fixedWr">
          <a class="logo" href="http://coveralls.io">Coveralls</a>
          <div class="fixedMenu">
            <a href="http://docs.coveralls.io">Docs</a>
            <a href="/features">Features</a>
            <a href="/enterprise">Enterprise</a>
            <a target="_blank" href="http://blog.coveralls.io/">Blog</a>
            <a rel="nofollow" class="btn btn-white" href="/authorize/github">Sign In</a>
          </div>
        </div>

        <div class='errorContent'>
          <img src="https://s3.amazonaws.com/assets.coveralls.io/assets/public_stomp-d9260221fff7d9ead77ff4106482e787.png" />
          <h1>
            Five Zero Zero
          </h1>
          <div class='errorMessage'>
            <h2>It's Gone Pear Shaped</h2>
            <p>There has been a hiccup, the revolution will continue after we stamp out the refusniks. You could also try refreshing or returning to the home page.</p>
            <p>Also, you might want to check out the Coveralls Status page to see if there is a system outage: <a href="http://status.coveralls.io" target="_blank">http://status.coveralls.io</p>
          </div>

          <br><br>

          <div class="twitter-feed">
            <a class="twitter-timeline" data-dnt="true" href="https://twitter.com/CoverallsApp" data-widget-id="380835385749667840">Tweets by @CoverallsApp</a>
            <script>!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0],p=/^http:/.test(d.location)?'http':'https';if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src=p+"://platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs");</script>
          </div>
        </div>

      </div>
    </div>

    <footer id="footer" class="">
    <div class="wrapper">
    <a href="/users">Admin</a>
    ·
    <a href="http://docs.coveralls.io/troubleshooting">Troubleshooting</a>
    ·
    <a href="https://github.com/lemurheavy/coveralls-public/issues">Open an Issue</a>
    ·
    <a rel="nofollow" href="mailto:sales@coveralls.io">Sales</a>
    ·
    <a rel="nofollow" href="mailto:support@coveralls.zendesk.com">Pro Support</a>
    ·
    <a rel="nofollow" href="https://coveralls.zendesk.com/hc/en-us/requests/new">FEEDBACK</a>
    ·
    <a href="https://enterprise.coveralls.io">ENTERPRISE</a>
    ·
    <a href="/careers">CAREERS</a>
    <br>
    <span>
    <a target="_blank" href="http://blog.coveralls.io/">BLOG</a>
    ·
    <a target="_blank" href="https://twitter.com/CoverallsApp">TWITTER</a>
    ·
    <a href="/legal">Legal &amp; Privacy</a>
    ·
    <a href="/supported-continuous-integration">Supported CI Services</a>
    ·
    <a href="/continuous-integration">What's a CI service?</a>
    ·
    <a href="/code-coverage">Automated Testing</a>
    </span>
    <p>
    &copy; 2016 <a rel="nofollow" href="http://lemurheavy.com">Lemur Heavy Industries</a>
    </p>
    </div>
    </footer>

  </body>
</html>
NoyaArie commented 6 years ago

I have the same problem :/ Any updates on this?

noyfactor commented 6 years ago

👍

joroshiba commented 6 years ago

@chrisblossom @NoyaArie @noyg which repos are you currently having this issue on? I can investigate.

I do also know we have some improvements for this endpoint coming out soon (including more useful error messages), but I'd be happy to check and see if there is anything else going wrong and help you all get this fixed.

chrisblossom commented 6 years ago

Thanks for the reply @bytewalls. Here is one of the commits that was giving me this error: https://github.com/chrisblossom/resolve-with-prefix/commit/e0883bc164888bcb0261d5be10e4ca2e1312fb35

NoyaArie commented 6 years ago

@bytewalls, it's a private company repo. The issue comes from CircleCI's parallel build: in the Coveralls app we get all three results, but we need them merged.

joroshiba commented 6 years ago

@NoyaArie We have deployed a fix for issues some people were seeing with parallel builds. Sorry for the inconvenience.

@chrisblossom the webhook should be added such that it runs after all parallel builds have completed. I have put in an inquiry with CircleCI about how this has changed with their 2.0 builds so we can update the documentation appropriately if needed. In the meantime, we did release an update, so you will hopefully at least see more useful errors.

joroshiba commented 6 years ago

@chrisblossom just confirmed with CircleCI that their 2.0 builds support the same infrastructure. Our documentation here should help get that webhook properly configured.

chrisblossom commented 6 years ago

@bytewalls I don't think this issue should be closed. I originally set up the parallel builds using CircleCI's webhook feature (which is not 100% supported according to their documentation/forum posts). It failed the same way, so I used the manual way shown above.

I'll try it again as soon as I can and report back the new error.

chrisblossom commented 6 years ago

@bytewalls the CircleCI webhook is still not working for Coveralls, and is no longer providing a status message after the Coveralls webhook response update. CircleCI's webhook feature is working, tested via https://webhook.site/#/ebadd7c8-644c-461f-825d-7ad8c1514877/df7e47c0-acc9-4001-97cd-85f0277a3865/1

https://circleci.com/gh/chrisblossom/resolve-with-prefix/80#config/containers/0
https://github.com/chrisblossom/resolve-with-prefix/commits/coveralls_webhook
https://circleci.com/gh/chrisblossom/resolve-with-prefix/tree/coveralls_webhook

When using the curl method, the response I am now getting: {"error":"No build matching CI build number 71 found"} via command: curl -k https://coveralls.io/webhook?repo_token=$COVERALLS_REPO_TOKEN -d "payload[build_num]=$CIRCLE_BUILD_NUM&payload[status]=done".

Please advise which CircleCI build number is correct.
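For reference, the request that curl command sends can be written out as a small Node.js sketch. The helper name `buildWebhookRequest` and the placeholder token are assumptions for illustration; only the URL and the two `payload[...]` fields come from the command above. It sends nothing over the network. It only shows the URL and the form-encoded body, where the square brackets end up percent-encoded. Judging by the error message, `build_num` presumably has to match the CI build number that was recorded when the coverage report was uploaded.

```javascript
// Sketch only: reproduces the URL and form body of the curl command.
// `buildWebhookRequest` is a hypothetical helper, not part of any Coveralls client.
function buildWebhookRequest(repoToken, buildNum) {
    const url = new URL('https://coveralls.io/webhook');
    url.searchParams.set('repo_token', repoToken);

    // Same two fields that curl passes with -d.
    const body = new URLSearchParams();
    body.set('payload[build_num]', String(buildNum));
    body.set('payload[status]', 'done');

    return { url: url.toString(), body: body.toString() };
}

// 'TOKEN' stands in for $COVERALLS_REPO_TOKEN and 71 for $CIRCLE_BUILD_NUM.
const request = buildWebhookRequest('TOKEN', 71);
console.log(request.url);
console.log(request.body);
```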

joroshiba commented 6 years ago

@chrisblossom, thanks for your patience. I didn't intend to close the issue before things were resolved.

It looks as though the webhook endpoint is performing correctly, but the initial builds are not capturing the build number from circle. We will investigate this and get back to you.

NoyaArie commented 6 years ago

@bytewalls I have the same issue as @chrisblossom ({"error":"No build matching CI build number 71 found"})

joroshiba commented 6 years ago

@NoyaArie which CI service are you using? Can you point me to the repo on coveralls this is affecting?

Thanks!

chrisblossom commented 6 years ago

@bytewalls thanks for reopening!

Any thoughts on why curl -k https://coveralls.io/webhook?repo_token=$COVERALLS_REPO_TOKEN -d "payload[build_num]=$CIRCLE_BUILD_NUM&payload[status]=done" isn't working? I'm confused because it looks like this is properly set up according to the docs.

NoyaArie commented 6 years ago

@bytewalls It's a private company repo, and we use CircleCI

joroshiba commented 6 years ago

@chrisblossom @NoyaArie we deployed some improvements to our parallel build process that should improve this and solve the CircleCI integration issue we were having there.

@chrisblossom the curl command wasn't working for the same reason the webhook wasn't: the build_number was not being captured correctly on the initial reports from Circle for builds using the Circle 2.0 framework. Also, I just want to note that it looks like your jobs are not running in parallel, since you don't have enough containers to run both under one build on CircleCI.

NoyaArie commented 6 years ago

@bytewalls I am still getting the same error {"error":"No build matching CI build number 71 found"}. Also, I no longer see the last build on my repo page in the Coveralls app. I have the link to my build in Coveralls from the CircleCI build output, but I don't see this build on my repo page (yesterday it did work).

joroshiba commented 6 years ago

@NoyaArie which private repo is this on?

NoyaArie commented 6 years ago

@bytewalls , WeConnect/store

NoyaArie commented 6 years ago

@bytewalls, after further tests I have a few conclusions:

Do you have a solution for the first one or an idea why the second one happens? Thank you for the help

joroshiba commented 6 years ago

@NoyaArie interesting. For the first, are your jobs on Circle running in parallel as separate builds, or are they running in parallel as two containers under the same build?

I've noticed that Circle 2.0 will run the builds separately under different build numbers when run, but I believe if there are two containers available where they run under the same build they will have the same number and only one webhook call.

Following that same logic, if the build is running in two separate 'builds' entirely as opposed to two parallel containers, that will result in the webhook firing twice, resulting in the total build coverage being calculated after one test run instead of after the two have completed.
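A toy model of the two cases described above, assuming jobs are grouped purely by the CI build number they report. This is an illustration only, not how Coveralls is actually implemented; `groupJobsByBuildNumber`, the job names, and the build numbers are made up.

```javascript
// Illustrative model only; not Coveralls' actual implementation.
// Each job reports coverage under the CI build number it ran in.
function groupJobsByBuildNumber(jobs) {
    const groups = new Map(); // build number -> job names
    for (const job of jobs) {
        if (!groups.has(job.buildNum)) {
            groups.set(job.buildNum, []);
        }
        groups.get(job.buildNum).push(job.name);
    }
    return groups;
}

// Two containers under one build: a single group, so one webhook call
// can close the build after both jobs have reported.
const sameBuild = groupJobsByBuildNumber([
    { name: 'container-0', buildNum: 71 },
    { name: 'container-1', buildNum: 71 },
]);

// Two separate builds: two groups, so each webhook call only ever
// sees one job and the total is calculated from that job alone.
const separateBuilds = groupJobsByBuildNumber([
    { name: 'job-a', buildNum: 71 },
    { name: 'job-b', buildNum: 72 },
]);

console.log(sameBuild.size);      // 1
console.log(separateBuilds.size); // 2
```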

What is the build number shown to you in Coveralls for the run which has the incorrect data? I can look at the backend and see what data we have received there and what the issue might be with the total calculation.

NoyaArie commented 6 years ago

@bytewalls, the first job runs in parallel; this job is for the build and specs. The second job is the merge. The second job starts only after the first job ends (all the parallel parts). There is only one workflow number, but each job has a different build number. Just to be more clear:

Is there an option to send the result to Coveralls by CircleCI workflow number rather than by build number?

chrisblossom commented 6 years ago

@bytewalls Sorry it has taken me so long to look back into this, and thank you for your continued help. It does indeed look like the build number is getting processed correctly now. But it looks like there is still no way (at least in open source) to merge builds if they do not have the same build number.

Is it possible for me to tell Coveralls to expect a specific number of builds (in my case 2) and send a git hash as a build number? This seems like it would solve @NoyaArie's issue as well.
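A rough sketch of what that request could look like, written as a hypothetical `BuildTracker` that is told how many reports to expect per commit. Nothing like this exists in Coveralls as far as this thread shows; the class, its method, and the `abc1234` hash are illustrative assumptions.

```javascript
// Hypothetical tracker, not an existing Coveralls feature.
class BuildTracker {
    constructor(expectedReports) {
        this.expectedReports = expectedReports;
        this.received = new Map(); // commit SHA -> Set of reporter names
    }

    // Records one report and returns true once every expected report
    // for that commit has arrived.
    record(sha, reporter) {
        if (!this.received.has(sha)) {
            this.received.set(sha, new Set());
        }
        const reporters = this.received.get(sha);
        reporters.add(reporter);
        return reporters.size >= this.expectedReports;
    }
}

const tracker = new BuildTracker(2);
const afterFirst = tracker.record('abc1234', 'circleci');
const afterSecond = tracker.record('abc1234', 'appveyor');
console.log(afterFirst);  // false
console.log(afterSecond); // true
```

Keying on the commit hash rather than a CI build number is what would let reports from different jobs, or different CI vendors, count toward the same total.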

joroshiba commented 6 years ago

@chrisblossom it looks as though the problem you have is that the builds are not running in parallel, and because of that they are not under the same build number on CircleCI. Perhaps they have the same workflow number?

We have started a team discussion on how to integrate the CircleCI workflow numbers into our system. Unfortunately, it appears they are not reporting this number in their webhook call, so as of now any implementation will require some manual work.

As for reporting the number of builds, we have discussed this before. With no integrated means to do prebuild webhooks, and a desire to make things as plug-and-play as possible across various CIs, we decided on our current approach. We may revisit this in the future.

chrisblossom commented 6 years ago

@bytewalls I guess I wasn't very clear, or maybe I'm not understanding how to merge builds when they are not run in parallel / not on the same CI host.

I use CircleCI and AppVeyor to test both *nix and Windows support (which I think is very typical). How can I merge builds across multiple CI vendors? Otherwise, what is the point of merging if builds from different environments can't be merged?

const os = require('os');
function example() {
    const platform = os.platform();
    if (platform === 'win32') {
        /**
         * Windows is tested on AppVeyor (appveyor.com)
         *
         * Will show uncovered on CircleCI
         */
    } else {
        /**
         * *nix is tested on CircleCI (circleci.com)
         *
         * Will show uncovered on AppVeyor
         */
    }
}

> We have started a team discussion on how to integrate the CircleCI workflow numbers into our system. Unfortunately, it appears they are not reporting this number in their webhook call, so as of now any implementation will require some manual work.

From my experience, using CircleCI's webhook doesn't give any feedback / isn't officially supported. It is better to use - run: curl -k https://coveralls.io/webhook?repo_token=$COVERALLS_REPO_TOKEN -d "payload[build_num]=$CIRCLE_BUILD_NUM&payload[status]=done" at the end of each workflow.

Regardless, why rely on CircleCI's numbers when a git hash is consistent across environments?

> As for reporting the number of builds, we have discussed this before. With no integrated means to do prebuild webhooks, and a desire to make things as plug-and-play as possible across various CIs, we decided on our current approach. We may revisit this in the future.

Why couldn't that just be sent with the post build webhook? Or even better, automatically merge results as they come in that have the same git hash (instead of using build number), so you wouldn't need to know how many builds to expect.
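To illustrate the idea, here is a minimal sketch of merging by git hash, assuming each report carries a commit SHA and a per-line coverage array (a hit count per line, or null for lines that are not relevant to coverage). `mergeLineCoverage`, `mergeReportsByCommit`, the `abc1234` SHA, and the sample arrays are hypothetical; this is not Coveralls' implementation.

```javascript
// Hypothetical sketch: merge per-line hit counts from reports that share a commit SHA.
// Treats a missing entry the same as null (line not relevant).
const normalize = (value) => (value === undefined || value === null ? null : value);

function mergeLineCoverage(a, b) {
    const length = Math.max(a.length, b.length);
    const merged = [];
    for (let i = 0; i < length; i++) {
        const left = normalize(a[i]);
        const right = normalize(b[i]);
        // A line stays null only if no report considers it relevant;
        // otherwise the hit counts are summed.
        merged.push(left === null && right === null ? null : (left || 0) + (right || 0));
    }
    return merged;
}

function mergeReportsByCommit(reports) {
    const byCommit = new Map(); // commit SHA -> merged coverage array
    for (const { sha, coverage } of reports) {
        const existing = byCommit.get(sha);
        byCommit.set(sha, existing ? mergeLineCoverage(existing, coverage) : coverage.slice());
    }
    return byCommit;
}

// Line 3 is only hit on one CI and line 5 only on the other,
// mirroring the win32 / *nix branches in the example above.
const merged = mergeReportsByCommit([
    { sha: 'abc1234', ci: 'circleci', coverage: [1, 1, 0, null, 1] },
    { sha: 'abc1234', ci: 'appveyor', coverage: [1, 1, 1, null, 0] },
]);
console.log(merged.get('abc1234')); // [2, 2, 1, null, 1]
```

With this approach no expected build count is needed: every report for a commit simply folds into the running total as it arrives.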

chrisblossom commented 6 years ago

@bytewalls Just wondering if you saw this last message? If it isn't something you want to support, that is okay; please just let me know. Should this be a separate issue?