networknt / light-4j

A fast, lightweight and more productive microservices framework
Apache License 2.0
3.61k stars 633 forks source link

Light-4J (2.1.21 to 2.1.37) possible bugs on LightCluster & AbstractRegistry classes #2356

Open yapkm01 opened 1 week ago

yapkm01 commented 1 week ago

Hi, I am getting the below error.

20241009_232820

My service.yml is as follows:

20241009_230717

The possible bug is as follows (as outlined under clause):

  private List<URL> discovery(String protocol, String serviceId, String tag) {
        if(logger.isDebugEnabled()) logger.debug("protocol = " + protocol + " serviceId = " + serviceId + " tag = " + tag);
        URL subscribeUrl = URLImpl.valueOf(protocol + "://localhost/" + serviceId);
        if(tag != null) {
            subscribeUrl.addParameter(Constants.TAG_ENVIRONMENT, tag);
        }
        if(logger.isDebugEnabled()) logger.debug("subscribeUrl = " + subscribeUrl);
        // subscribe is async and the result won't come back immediately.
        registry.subscribe(subscribeUrl, null); // clause 1 -- passing null as a 2nd parameter
        // do a lookup for the quick response from either cache or registry service.
        List<URL> urls = registry.discover(subscribeUrl);
        if(logger.isDebugEnabled()) logger.debug("discovered urls = " + urls);
        return urls;
    }

This will call AbstractRegistry.subscribe method:

@Override
    public void subscribe(URL url, NotifyListener listener) { // clause 2 -- listener will be null here.
        if (url == null) {
            logger.warn("[{}] subscribe with malformed param, url:{}", registryClassName, url);
            return;
        }
        if(logger.isInfoEnabled()) logger.info("[{}] Listener ({}) will subscribe to url ({}) in Registry [{}]",
                registryClassName, listener, url, registryUrl.getIdentity());
        doSubscribe(url.createCopy(), listener); // clause 3 -- will call this
    }

This will call ConsulRegistry.doSubscribe method

 @Override
    protected void doSubscribe(URL url, final NotifyListener listener) {
        // you only need to subscribe once.
        if(!subscribedSet.contains(url)) {
            addNotifyListener(url, listener); // clause 4 - will call this
            startListenerThreadIfNewService(url);
            subscribedSet.add(url);
        }
    }

This will call the ConsulRegistry.addNotifyListener method which is where the issue is:

    private void addNotifyListener(URL url, NotifyListener listener) {
        String service = ConsulUtils.getUrlClusterInfo(url);
        ConcurrentHashMap<URL, NotifyListener> map = notifyListeners.get(service);  // clause 5 
        if (map == null) { // clause 6
            notifyListeners.putIfAbsent(service, new ConcurrentHashMap<>()); // clause 7
            map = notifyListeners.get(service); // clause 8
        }
        synchronized (map) { // clause 9
            map.put(url, listener); // clause 10
        }
    }

This looks like a bug. Supposedly this method is invoked 1st time.

  1. There's value in the url, hence there is service.
  2. Clause 5 will return null since it's the 1st time. Meaning map == null.
  3. Clause 6 will be true.
  4. Clause 7 will insert a service and an empty container into notifyListeners object.
  5. Clause 8 will return {} which is empty container created by clause 7
  6. Clause 9 will go through.
  7. Clause 10 will fail since listener is null (passed from clause 1).

Can you help? Maybe it's something i missed but i just don't see any other thing other than a bug. This same code exists from version 2.0.21 - 2.1.37.

mihai-vladuc commented 15 hours ago

@stevehu Hi Steve, yapkm01 is part of our organization, please help resolving this issue. Thank You