alibaba / spring-cloud-alibaba

Spring Cloud Alibaba provides a one-stop solution for application development for the distributed solutions of Alibaba middleware.
https://sca.aliyun.com
Apache License 2.0

Still invoke fail in a situation after #813 fixed #904

Open yubinnng opened 5 years ago

yubinnng commented 5 years ago

Which Component: Dubbo Spring Cloud, Nacos

Spring Cloud Alibaba Version: 2.1.1.BUILD-SNAPSHOT

Describe the bug: Same as https://github.com/alibaba/spring-cloud-alibaba/issues/753#issuecomment-518957726. After #813, the error still occurs when I restart the consumer and the provider together.

logs: Failed to invoke the method ... No provider available for the service

fangjian0423 commented 5 years ago

could you provide a project that reproduces the issue?

yubinnng commented 5 years ago

> could you provide a project that reproduces the issue?

This error occurred on the server, where a webhook triggers Jenkins to run docker-compose to rebuild and restart the service images. No matter how long I wait, some consumer services still report the error. This is the log:

[Image: dubbo error]

But when I tested a small demo project, the error didn't occur. Maybe the demo project doesn't have enough services to trigger it.

tyq0010 commented 5 years ago

Me too!

Nacos Server Version: 1.1.3
Dubbo Version: 2.7.3
Spring Cloud Alibaba Version: 2.1.1.BUILD-SNAPSHOT

The official examples can reproduce the issue (spring-cloud-alibaba-dubbo-examples: spring-cloud-dubbo-client-sample and spring-cloud-dubbo-server-sample), with this setting:

dubbo:
  consumer:
    check: false

Before #813: starting the consumer before the provider doesn't work.

After #813: starting the consumer before the provider works fine, but starting the consumer and the provider together doesn't work (that is, starting the provider immediately after starting the consumer).

logs: Failed to invoke the method ... No provider available for the service...

Once this happens, no matter how long I wait, the consumer still reports the error.

fangjian0423 commented 5 years ago

@tyq0010 Are you sure the problem occurs with the 2.1.1.BUILD-SNAPSHOT version?

If so, please provide a project that reproduces the issue.

tyq0010 commented 5 years ago

Yes, I'm sure. The official examples can reproduce the issue:

  1. Switch to greenwich branch.
  2. Modify the "spring-cloud-alibaba-examples/spring-cloud-alibaba-dubbo-examples/spring-cloud-dubbo-client-sample/src/main/resources/bootstrap.yaml" file, add "dubbo.consumer.check=false"
  3. Start the local nacos server (v1.1.3).
  4. Start the consumer (spring-cloud-dubbo-client-sample->DubboSpringCloudClientBootstrap.java)
  5. Start the provider (spring-cloud-dubbo-server-sample->DubboSpringCloudServerBootstrap.java) immediately (as soon as possible, typically within 5 seconds)
  6. Open your browser and visit "http://127.0.0.1:8080/echo?message=dubbo"

Starting the provider after the consumer has finished starting: no problem.
Starting the consumer immediately after starting the provider: no problem.
Starting the provider immediately after starting the consumer: it goes wrong.

fangjian0423 commented 5 years ago

> Start provider immediately after starting consumer, go wrong.

Did you remove the ApplicationRunner bean in the provider project? If an exception is thrown in the ApplicationRunner, the application fails to start.

I removed it and it works well locally.

tyq0010 commented 5 years ago

@fangjian0423 My code is different from yours; my provider does not have an ApplicationRunner. And I've tried many times.

The provider:

@EnableDiscoveryClient
@EnableAutoConfiguration
public class DubboSpringCloudServerBootstrap {

    public static void main(String[] args) {
        SpringApplication.run(DubboSpringCloudServerBootstrap.class);
    }
}

@Service
class EchoServiceImpl implements EchoService {

    @Override
    public String echo(String message) {
        return "[echo] Hello, " + message;
    }
}

The Consumer:

@EnableDiscoveryClient
@EnableAutoConfiguration
@RestController
public class DubboSpringCloudClientBootstrap {

    @Reference
    private EchoService echoService;

    @GetMapping("/echo")
    public String echo(String message) {
        return echoService.echo(message);
    }

    public static void main(String[] args) {
        SpringApplication.run(DubboSpringCloudClientBootstrap.class);
    }
}

fangjian0423 commented 5 years ago

Oh, but why did you say the official examples can reproduce the issue? 😢

Are there any exception logs for the problem?

tyq0010 commented 5 years ago

@fangjian0423

Aren't these the official examples?

https://github.com/alibaba/spring-cloud-alibaba/blob/greenwich/spring-cloud-alibaba-examples/spring-cloud-alibaba-dubbo-examples/spring-cloud-dubbo-server-sample/src/main/java/com/alibaba/cloud/dubbo/bootstrap/DubboSpringCloudServerBootstrap.java

https://github.com/alibaba/spring-cloud-alibaba/blob/master/spring-cloud-alibaba-examples/spring-cloud-alibaba-dubbo-examples/spring-cloud-dubbo-client-sample/src/main/java/com/alibaba/cloud/dubbo/bootstrap/DubboSpringCloudClientBootstrap.java

fangjian0423 commented 5 years ago

Sorry, I didn't see it clearly.

The spring-cloud-dubbo-client-sample and spring-cloud-dubbo-server-sample projects use the 0.9.0.RELEASE version; see https://github.com/alibaba/spring-cloud-alibaba/blob/master/spring-cloud-alibaba-examples/spring-cloud-alibaba-dubbo-examples/spring-cloud-dubbo-server-sample/pom.xml#L25.

Please check which version you are using.

tyq0010 commented 5 years ago

@fangjian0423 Sorry, I didn't see it clearly either.

I just updated the version and tried many times. I can still reproduce the problem, although things are much better than before.

The spring-cloud-dubbo-client-sample pom.xml:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <parent>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-build</artifactId>
        <version>2.1.3.RELEASE</version>
        <relativePath/>
    </parent>
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.alibaba.cloud</groupId>
    <artifactId>spring-cloud-dubbo-client-sample</artifactId>
    <name>Spring Cloud Dubbo Client Sample</name>
    <version>2.1.1.BUILD-SNAPSHOT</version>

    <dependencyManagement>
        <dependencies>
            <!-- Spring Cloud Alibaba dependencies -->
            <dependency>
                <groupId>com.alibaba.cloud</groupId>
                <artifactId>spring-cloud-alibaba-dependencies</artifactId>
                <version>${project.version}</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <dependencies>
        <!-- Sample API -->
        <dependency>
            <groupId>com.alibaba.cloud</groupId>
            <artifactId>spring-cloud-dubbo-sample-api</artifactId>
            <version>${project.version}</version>
        </dependency>

        <!-- Spring Boot dependencies -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-actuator</artifactId>
        </dependency>

        <!-- Dubbo Spring Cloud Starter -->
        <dependency>
            <groupId>com.alibaba.cloud</groupId>
            <artifactId>spring-cloud-starter-dubbo</artifactId>
        </dependency>

        <dependency>
            <groupId>org.apache.dubbo</groupId>
            <artifactId>dubbo-spring-boot-starter</artifactId>
            <version>2.7.3</version>
        </dependency>

        <!-- Spring Cloud Nacos Service Discovery -->
        <dependency>
            <groupId>com.alibaba.cloud</groupId>
            <artifactId>spring-cloud-starter-alibaba-nacos-discovery</artifactId>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
</project>

The spring-cloud-dubbo-server-sample pom.xml:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <parent>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-build</artifactId>
        <version>2.1.3.RELEASE</version>
        <relativePath/>
    </parent>

    <modelVersion>4.0.0</modelVersion>

    <groupId>com.alibaba.cloud</groupId>
    <artifactId>spring-cloud-dubbo-server-sample</artifactId>
    <name>Spring Cloud Dubbo Server Sample</name>
    <version>2.1.1.BUILD-SNAPSHOT</version>

    <dependencyManagement>
        <dependencies>
            <!-- Spring Cloud Alibaba dependencies -->
            <dependency>
                <groupId>com.alibaba.cloud</groupId>
                <artifactId>spring-cloud-alibaba-dependencies</artifactId>
                <version>${project.version}</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <dependencies>

        <!-- Sample API -->
        <dependency>
            <groupId>com.alibaba.cloud</groupId>
            <artifactId>spring-cloud-dubbo-sample-api</artifactId>
            <version>${project.version}</version>
        </dependency>

        <!-- Spring Boot dependencies -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-actuator</artifactId>
        </dependency>

        <!-- Dubbo Spring Cloud Starter -->
        <dependency>
            <groupId>com.alibaba.cloud</groupId>
            <artifactId>spring-cloud-starter-dubbo</artifactId>
        </dependency>

        <dependency>
            <groupId>org.apache.dubbo</groupId>
            <artifactId>dubbo-spring-boot-starter</artifactId>
            <version>2.7.3</version>
        </dependency>

        <!-- Spring Cloud Nacos Service Discovery -->
        <dependency>
            <groupId>com.alibaba.cloud</groupId>
            <artifactId>spring-cloud-starter-alibaba-nacos-discovery</artifactId>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
</project>

When the consumer logs the following during startup, it goes wrong. (This is harder to reproduce, but it will happen if you try enough times.)

[DUBBO] Failed to start NettyClient PC-20160531OMSO/192.168.1.210 connect to the server /192.168.1.210:20881 (check == false, ignore and retry later!), cause: client(url: dubbo://192.168.1.210:20881/com.alibaba.cloud.dubbo.service.DubboMetadataService?anyhost=true&application=spring-cloud-alibaba-dubbo-client&bind.ip=192.168.1.210&bind.port=20881&check=false&codec=dubbo&deprecated=false&dubbo=2.0.2&dynamic=true&generic=true&group=spring-cloud-alibaba-dubbo-server&heartbeat=5000&interface=com.alibaba.cloud.dubbo.service.DubboMetadataService&lazy=false&methods=getAllServiceKeys,getServiceRestMetadata,getExportedURLs,getAllExportedURLs&pid=8736&qos.enable=false&register=true&register.ip=192.168.1.210&release=2.7.3&remote.application=spring-cloud-alibaba-dubbo-server&revision=1.0.0&side=consumer&sticky=false&timestamp=1569556742521&version=1.0.0) failed to connect to server /192.168.1.210:20881, error message is:Connection refused: no further information: /192.168.1.210:20881, dubbo version: 2.7.3, current host: 192.168.1.210

org.apache.dubbo.remoting.RemotingException: client(url: dubbo://192.168.1.210:20881/com.alibaba.cloud.dubbo.service.DubboMetadataService?anyhost=true&application=spring-cloud-alibaba-dubbo-client&bind.ip=192.168.1.210&bind.port=20881&check=false&codec=dubbo&deprecated=false&dubbo=2.0.2&dynamic=true&generic=true&group=spring-cloud-alibaba-dubbo-server&heartbeat=5000&interface=com.alibaba.cloud.dubbo.service.DubboMetadataService&lazy=false&methods=getAllServiceKeys,getServiceRestMetadata,getExportedURLs,getAllExportedURLs&pid=8736&qos.enable=false&register=true&register.ip=192.168.1.210&release=2.7.3&remote.application=spring-cloud-alibaba-dubbo-server&revision=1.0.0&side=consumer&sticky=false&timestamp=1569556742521&version=1.0.0) failed to connect to server /192.168.1.210:20881, error message is:Connection refused: no further information: /192.168.1.210:20881
    at org.apache.dubbo.remoting.transport.netty4.NettyClient.doConnect(NettyClient.java:166) [dubbo-2.7.3.jar:2.7.3]

After both the consumer and the provider are up, visiting "http://127.0.0.1:8080/echo?message=dubbo" makes the consumer report an error (...No provider available...).

This seems to be related to caching, because I change the provider's port on every run and the port number in the consumer log is the previous provider's. For example, if the currently started provider's port is 20882 and the previous provider's port was 20881, the consumer log shows this at startup:

Failed to start NettyClient PC-20160531OMSO/192.168.1.210 connect to the server /192.168.1.210:20881). 

and after the consumer has started, the log shows this at intervals:

header.ReconnectTimerTask    :  [DUBBO] Fail to connect to HeaderExchangeClient [channel=org.apache.dubbo.remoting.transport.netty4.NettyClient [192.168.1.210:0 -> /192.168.1.210:20881]]

To avoid interference from the heartbeat, I stopped both the consumer and the provider for several minutes before each run. (This is not necessary, though.)
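
To make the suspected caching behaviour concrete, here is a minimal, self-contained sketch of how a lookup cache that is filled once and never refreshed keeps returning the previous provider's port. It is only a model of the symptom described above, not Spring Cloud Alibaba or Dubbo code: the class `StaleEndpointDemo`, the `registry` and `cache` maps, and the `resolve` method are all hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of a consumer-side lookup cache; not Spring Cloud Alibaba code. */
public class StaleEndpointDemo {

    /** Stand-in for the registry: service name -> address currently registered. */
    public static final Map<String, String> registry = new HashMap<>();

    /** Stand-in for the consumer cache: filled on the first lookup, never refreshed. */
    public static final Map<String, String> cache = new HashMap<>();

    /** Returns the cached address and consults the registry only on a cache miss. */
    public static String resolve(String serviceName) {
        return cache.computeIfAbsent(serviceName, registry::get);
    }

    public static void main(String[] args) {
        String service = "spring-cloud-alibaba-dubbo-server";

        // The registry still holds the entry left behind by the previous provider.
        registry.put(service, "192.168.1.210:20881");
        System.out.println(resolve(service)); // prints 192.168.1.210:20881

        // The new provider registers on a different port, but the cache is not evicted.
        registry.put(service, "192.168.1.210:20882");
        System.out.println(resolve(service)); // still prints 192.168.1.210:20881
    }
}
```

In this model the second lookup never reaches the registry, which would match a consumer that keeps trying to reconnect to port 20881 after the provider has moved to 20882.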

tyq0010 commented 5 years ago

I think it is related to #936.

fangjian0423 commented 5 years ago

@tyq0010 Thank you for your report. We will try to verify and fix it soon.

fangjian0423 commented 5 years ago

Hi @tyq0010, I just fixed it.

You could try it if you have time.

tyq0010 commented 5 years ago

@fangjian0423 Perfect! It's completely normal now! 👍

I debugged the code both before and after the fix. The following cache-clearing code is what takes effect:

//AbstractSpringCloudRegistry.java
protected void subscribeDubboServiceURL (...) {
    //...
    repository.removeMetadata(serviceName);
    dubboGenericServiceFactory.destroy(serviceName);
    //...
}
//DubboGenericServiceFactory.java
public synchronized void destroy(String serviceName) {
    //...
    ReferenceBean<GenericService> referenceBean = cache.remove(key);
    referenceBean.destroy();
    //...
}
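
The quoted snippet elides most of the real methods, so the following is only a rough, runnable sketch of the same evict-then-destroy idea. The class `ReferenceCacheSketch`, its nested `CachedReference` type, and the `getOrCreate` helper are hypothetical stand-ins, and the null check before `destroy()` is an assumption of this sketch rather than something shown in the project's code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical evict-and-destroy cache; names are illustrative, not the project's API. */
public class ReferenceCacheSketch {

    /** Stand-in for a cached remote reference that holds a connection to one address. */
    public static final class CachedReference {
        public final String address;
        private boolean destroyed;

        public CachedReference(String address) {
            this.address = address;
        }

        /** Marks the reference as released; a real one would close its connection. */
        public void destroy() {
            destroyed = true;
        }

        public boolean isDestroyed() {
            return destroyed;
        }
    }

    private final Map<String, CachedReference> cache = new ConcurrentHashMap<>();

    /** Returns the cached reference, creating one for currentAddress on a miss. */
    public CachedReference getOrCreate(String serviceName, String currentAddress) {
        return cache.computeIfAbsent(serviceName, name -> new CachedReference(currentAddress));
    }

    /** Evicts the entry and releases it; returns null when nothing was cached. */
    public synchronized CachedReference destroy(String serviceName) {
        CachedReference removed = cache.remove(serviceName);
        if (removed != null) {
            removed.destroy();
        }
        return removed;
    }

    public static void main(String[] args) {
        ReferenceCacheSketch factory = new ReferenceCacheSketch();
        String service = "spring-cloud-alibaba-dubbo-server";

        CachedReference old = factory.getOrCreate(service, "192.168.1.210:20881");
        factory.destroy(service); // called when the subscription for the service changes
        CachedReference fresh = factory.getOrCreate(service, "192.168.1.210:20882");

        System.out.println(old.isDestroyed()); // prints true
        System.out.println(fresh.address);     // prints 192.168.1.210:20882
    }
}
```

Evicting before the next lookup means the replacement reference is built from the address that is current at that moment, instead of reusing one created for a provider that no longer exists.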

fangjian0423 commented 5 years ago

> Perfect! It's completely normal now! 👍 I debugged the code before and after the fix, respectively. The following code for clearing the cache takes effect.

Thank you for the report. We look forward to more suggestions from you.

zmapleshine commented 4 years ago

I tested the 2.2.1.RELEASE version and it still has this problem...