Azure / azure-sdk-for-js

This repository is for active development of the Azure SDK for JavaScript (NodeJS & Browser). For consumers of the SDK we recommend visiting our public developer docs at https://docs.microsoft.com/javascript/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-js.
MIT License

Cosmos DB Node JS client fails to connect when working behind corporate proxy #6273

Closed Pinif closed 4 years ago

Pinif commented 4 years ago

Describe the bug

When trying to connect to Cosmos DB from a computer behind a corporate proxy, the connection fails. The error occurs only with the Cosmos DB Node JS client; the Cosmos DB Python client works as expected. The error is raised inside the Node JS client code and is caught there, so nothing is returned to the calling code.

The specific error is (can be seen only via debugging the client code): "Error: tunneling socket could not be established, cause=socket hang up"

It appears that the code inside Cosmos DB NodeJS client (the function createRequestObject, inside request.ts) does not use the proxy configuration passed to it in the connection policy object. This is the root cause of the problem.

I'm adding here 2 code snippets, the Node JS (not working) client and the Python (working) client

Node JS client code (not working)

let connectionPolicy = {
  ProxyUrl: 'https://<proxy>:8080',
  RequestTimeout: 1000,
  DisableSSLVerification: true
};
const cosmosClient = new CosmosClient({
  endpoint: "https://<target-cosmosdb>:443/",
  auth: { masterKey: "<master-key-value>" },
  connectionPolicy: connectionPolicy
});
const { resource: databaseDefinition } = await cosmosClient.database("<db-name>").read();

Python client code (working)

import azure.cosmos.cosmos_client as CosmosClient
import azure.cosmos.documents as documents

connection_policy = documents.ConnectionPolicy()
connection_policy.DisableSSLVerification = True
connection_policy.RequestTimeout = 1000
connection_policy.ProxyConfiguration = documents.ProxyConfiguration()
connection_policy.ProxyConfiguration.Host = 'https://proxy'
connection_policy.ProxyConfiguration.Port = 8080
client = CosmosClient.CosmosClient('https://:443/', { 'masterKey' : ''}, connection_policy)

database_link = 'dbs/'
database = client.ReadDatabase(database_link)

To Reproduce

Steps to reproduce the behavior:

  1. Create a Node JS Cosmos DB client according to the code snippet above
  2. Run the code when behind a corporate proxy

Expected behavior

The connection should open successfully and the requested database should be read.

Screenshots

N/A

Additional context

N/A

Pinif commented 4 years ago

Attaching the Python code snippet as a file, because the markup stripped some parts of the code above:

test_cosmos_db.txt

Pinif commented 4 years ago

The require directive I used for running the Node JS code snippet above is:

const CosmosClient = require("@azure/cosmos").CosmosClient;

ghost commented 4 years ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @kushagraThapar @srinathnarayanan @southpolesteve @shurd

southpolesteve commented 4 years ago

@Pinif Are you able to upgrade to the latest version of the SDK? In version 3.x.x we allow passing a custom agent to configure the proxy which we found to be more reliable for nodeJS

Pinif commented 4 years ago

@Pinif Are you able to upgrade to the latest version of the SDK? In version 3.x.x we allow passing a custom agent to configure the proxy which we found to be more reliable for nodeJS

I upgraded to the Azure Cosmos client version 3.4.2 and still get the same problem: the client hangs when performing any request (createIfNotExists, read, etc.).

southpolesteve commented 4 years ago

@Pinif Can you paste the code you are using? Can you also try running the example code from proxy-agent: https://github.com/TooTallNate/node-proxy-agent#example and see if the issue persists?

Trying to determine if the problem is specific to the cosmos SDK or a more general node issue.

Pinif commented 4 years ago

@Pinif Can you paste the code you are using? Can you also try running the example code from proxy-agent: https://github.com/TooTallNate/node-proxy-agent#example and see if the issue persists?

Trying to determine if the problem is specific to the cosmos SDK or a more general node issue.

The issue persists when using proxy-agent or https-proxy-agent (tried them both). My code with proxy-agent:

var ProxyAgent = require('proxy-agent');
var proxyUri = 'https://genproxy.amdocs.com:8080';
let proxyAgent2 = new ProxyAgent(proxyUri);

const cosmosClient = new CosmosClient({ agent: proxyAgent2, endpoint: ..., auth: { masterKey: ... }});

const dbResponse = cosmosClient.databases.createIfNotExists( { id : ...} )

The problem is specific to the Cosmos DB Node JS SDK. As described in my initial comment on this issue, I started out with the same code on the same laptop and different SDKs (using only the connection policy argument, without an agent).

The Azure Python Cosmos SDK worked without a problem. The Azure Node JS Cosmos SDK gives a "socket hang up" error.

So this is a problem specific to the Cosmos Node JS SDK, not a general Node issue.

southpolesteve commented 4 years ago

@Pinif I tried to reproduce the issue using a local node proxy server and everything worked as expected:

const { CosmosClient } = require('@azure/cosmos')
const ProxyAgent = require('proxy-agent')
const proxyEndpoint = 'http://localhost:8080'
const agent = new ProxyAgent(proxyEndpoint)

const endpoint = '...'
const key = '...'

const client = new CosmosClient({
  agent,
  endpoint,
  key
})

client.databases
  .readAll()
  .fetchAll()
  .then(response => {
    console.log(response.resources)
  })

Do you have a mechanism for debugging or logging from inside your proxy server? I would still like to have you run the proxy agent example code to verify there is no issue with your proxy or network. Assuming your proxy lets you talk to the public internet:

const https = require('https')
const ProxyAgent = require('proxy-agent')
const url = 'https://genproxy.amdocs.com:8080'
const opts = {
  method: 'GET',
  host: 'microsoft.com',
  path: '/',

  agent: new ProxyAgent(url)
}

http.get(opts, res => {
  console.log(res.statusCode, res.headers)
  res.pipe(process.stdout)
})

Pinif commented 4 years ago

@southpolesteve I tried the code you pasted regarding connecting to microsoft.com via our proxy, and it didn't work (getting connection reset). Also there was a small typo there (should be https.get instead of http.get :-)).

However, the following three snippets DO work (using http, request, and axios) and connect to microsoft.com through the Amdocs proxy, so there is no issue with either the proxy or our network:

    /* Option 1 */

    const http = require("http");

    const options = {
        host: "genproxy.amdocs.com",
        port: 8080,
        path: "https://www.microsoft.com"
    };

    http.get(options, function(res) {
        console.log(res);
        res.pipe(process.stdout);
    });

    /* Option 2 */

    var request = require('request');

    request({
        'url':'https://www.microsoft.com',
        'method': "GET",
        'proxy':'http://genproxy.amdocs.com:8080'
    },function (error, response, body) {
        if (!error && response.statusCode == 200) {
            console.log(body);
        }
    })

    /* Option 3 */

    var axios = require('axios');
    const HttpsProxyAgent = require("https-proxy-agent")

    const agent = new HttpsProxyAgent({host: "genproxy.amdocs.com", port: "8080" })
    axios = axios.create({
        httpsAgent: agent
    });

    axios.get("https://www.microsoft.com")
        .then(response => {
            let res = response.data;
            console.log(res)
        })
        .catch(err => console.log(err));

Pinif commented 4 years ago

@southpolesteve I assume there are some differences between a local node proxy server and a real corporate proxy server...

It is possible that the Cosmos DB Node JS client was tested against a Node proxy server and works there, but does not work against a corporate proxy. That would explain why the specific code snippet you sent me does not work, while other libraries work fine with the Amdocs proxy and reach microsoft.com.

southpolesteve commented 4 years ago

@Pinif Without a way to reproduce the issue, I am not sure how else I can help here. We do have other customers using the SDK through their corporate proxies. I would recommend looking into any methods of debugging from the proxy side of the connection. You can also run the script that calls Cosmos with the NODE_DEBUG environment variable set to "https,http". That will provide the full debug logs for Node's HTTP stack and may be helpful.

To clarify, there is no custom SDK behavior around connection handling. We pass the user-provided agent directly to node-fetch without modification.
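
One way to capture the NODE_DEBUG output mentioned above is to launch the script from a small wrapper and collect stderr, which is where Node writes these debug logs. This is only a sketch: `runWithHttpDebug` is a hypothetical helper name, and the inline `-e` script stands in for the real script that calls Cosmos.

```javascript
const { spawnSync } = require("child_process");

// Hypothetical helper: runs `node <args>` with NODE_DEBUG=https,http
// and returns the exit status plus captured stdout and stderr.
function runWithHttpDebug(nodeArgs) {
  const result = spawnSync(process.execPath, nodeArgs, {
    env: { ...process.env, NODE_DEBUG: "https,http" },
    encoding: "utf8"
  });
  // Node's internal HTTP/HTTPS debug lines go to stderr, not stdout.
  return { status: result.status, stdout: result.stdout, stderr: result.stderr };
}

// Stand-in for the real script: just echo the variable the child sees.
const demo = runWithHttpDebug(["-e", "console.log(process.env.NODE_DEBUG)"]);
console.log(demo.stdout.trim());
```

For a real run, the arguments would be the path to the script under test, and `stderr` would hold the log to paste into the issue. Such logs can contain tokens and authorization headers, so they should be reviewed before sharing.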

Pinif commented 4 years ago

To clarify, there is no custom SDK behavior around connection handling. We pass the user-provided agent directly to node-fetch without modification.

Not exactly. Your code uses neither the connection policy nor the agent settings when sending the request from within the Cosmos DB client.

If you look at your code under @azure/cosmos/src/request/request.ts, the function createRequestObject:

/** @hidden */
export function createRequestObject(
  connectionPolicy: ConnectionPolicy,
  requestOptions: https.RequestOptions,
  body: Buffer
): Promise<Response> {
  return new Promise<Response>((resolve, reject) => {
    function onTimeout() {
      httpsRequest.abort();
    }

const isMedia = requestOptions.path.indexOf("//media") === 0;

const httpsRequest: ClientRequest = https.request(requestOptions, (response: ClientResponse) => {
  // In case of media response, return the stream to the user and the user will need
  // to handle reading the stream.
  if (isMedia && connectionPolicy.MediaReadMode === MediaReadMode.Streamed) {
    return resolve({
      result: response,
      headers: response.headers as IHeaders
    });
  }

  let data = "";

  // if the requested data is text (not attachment/media) set the encoding to UTF-8
  if (!isMedia) {
    response.setEncoding("utf8");
  }

  response.on("data", chunk => {
    data += chunk;
  });
  response.on("end", () => {
    if (response.statusCode >= 400) {
      return reject(getErrorBody(response, data, response.headers as IHeaders));
    }

    let result;
    try {
      result = isMedia ? data : data.length > 0 ? JSON.parse(data) : undefined;
    } catch (exception) {
      return reject(exception);
    }

    resolve({ result, headers: response.headers as IHeaders, statusCode: response.statusCode });
  });
});

httpsRequest.once("socket", (socket: Socket) => {
  if (isMedia) {
    socket.setTimeout(connectionPolicy.MediaRequestTimeout);
  } else {
    socket.setTimeout(connectionPolicy.RequestTimeout);
  }

  socket.once("timeout", onTimeout);

  httpsRequest.once("response", () => {
    socket.removeListener("timeout", onTimeout);
  });
});

httpsRequest.once("error", reject);

if (body) {
  httpsRequest.write(body);
  httpsRequest.end();
} else {
  httpsRequest.end();
}

  });
}

Pinif commented 4 years ago

(This is from Cosmos client version 2.1.7) The function gets the connection policy object (with the proxy details set inside it), and does not do anything with it.

This causes the request to be sent incorrectly, from what I saw...

Pinif commented 4 years ago

Node debug log when using agent settings:

(node:18556) Warning: Setting the NODE_DEBUG environment variable to 'http' can expose sensitive data (such as passwords, tokens and authentication headers) in the resulting log.
Server listening on port: 3001
HTTP 18556: outgoing message end.
Waiting for the debugger to disconnect...

Pinif commented 4 years ago

Node debug log when using connection policy settings (reminder: this works OK when using Python Azure Cosmos DB client):

HTTP 11800: call onSocket 0 0
HTTP 11800: createConnection genproxy.amdocs.com:8080:::::::::::::::::: { host: 'genproxy.amdocs.com', port: 8080, headers: {}, method: 'CONNECT', path: null, agent: false, secureEndpoint: true, _defaultAgent: Agent { _events: [Object: null prototype] { free: [Function] }, _eventsCount: 1, _maxListeners: undefined, defaultPort: 443, protocol: 'https:', options: { path: null }, requests: {}, sockets: {}, freeSockets: {}, keepAliveMsecs: 1000, keepAlive: false, maxSockets: Infinity, maxFreeSockets: 256, maxCachedSessions: 100, _sessionCache: { map: {}, list: [] } }, servername: 'genproxy.amdocs.com', _agentKey: 'genproxy.amdocs.com:8080::::::::::::::::::' }
HTTPS 11800: createConnection { host: 'genproxy.amdocs.com', port: 8080, headers: {}, method: 'CONNECT', path: null, agent: false, secureEndpoint: true, _defaultAgent: Agent { _events: [Object: null prototype] { free: [Function] }, _eventsCount: 1, _maxListeners: undefined, defaultPort: 443, protocol: 'https:', options: { path: null }, requests: {}, sockets: {}, freeSockets: {}, keepAliveMsecs: 1000, keepAlive: false, maxSockets: Infinity, maxFreeSockets: 256, maxCachedSessions: 100, _sessionCache: { map: {}, list: [] } }, servername: 'genproxy.amdocs.com', _agentKey: 'genproxy.amdocs.com:8080::::::::::::::::::', encoding: null }
HTTP 11800: sockets genproxy.amdocs.com:8080:::::::::::::::::: 1
HTTP 11800: outgoing message end.
HTTP 11800: outgoing message end.

Pinif commented 4 years ago

Here's another potential bug I found in an earlier stage, this time in the CosmosClient constructor itself (from @azure/cosmos/src/CosmosClient.ts):

The code checks for the following setting (see bolded below):

constructor(private options: CosmosClientOptions) {
  options.auth = options.auth || {};
  if (options.key) {
    options.auth.key = options.key;
  }

options.connectionPolicy = Helper.parseConnectionPolicy(options.connectionPolicy);

options.defaultHeaders = options.defaultHeaders || {};
options.defaultHeaders[Constants.HttpHeaders.CacheControl] = "no-cache";
options.defaultHeaders[Constants.HttpHeaders.Version] = Constants.CurrentVersion;
if (options.consistencyLevel !== undefined) {
  options.defaultHeaders[Constants.HttpHeaders.ConsistencyLevel] = options.consistencyLevel;
}

const platformDefaultHeaders = Platform.getPlatformDefaultHeaders() || {};
for (const platformDefaultHeader of Object.keys(platformDefaultHeaders)) {
  options.defaultHeaders[platformDefaultHeader] = platformDefaultHeaders[platformDefaultHeader];
}

options.defaultHeaders[Constants.HttpHeaders.UserAgent] = Platform.getUserAgent();

**if (!this.options.agent) {**
  // Initialize request agent
  const requestAgentOptions: AgentOptions & tunnel.HttpsOverHttpsOptions & tunnel.HttpsOverHttpOptions = {
    keepAlive: true
  };
  if (!!this.options.connectionPolicy.ProxyUrl) {
    const proxyUrl = url.parse(this.options.connectionPolicy.ProxyUrl);
    const port = parseInt(proxyUrl.port, 10);
    requestAgentOptions.proxy = {
      host: proxyUrl.hostname,
      port,
      headers: {}
    };

    if (!!proxyUrl.auth) {
      requestAgentOptions.proxy.proxyAuth = proxyUrl.auth;
    }

    this.options.agent =
      proxyUrl.protocol.toLowerCase() === "https:"
        ? tunnel.httpsOverHttps(requestAgentOptions)
        : tunnel.httpsOverHttp(requestAgentOptions); // TODO: type coersion
  } else {
    this.options.agent = new Agent(requestAgentOptions); // TODO: Move to request?
  }
}

const globalEndpointManager = new GlobalEndpointManager(this.options, async (opts: RequestOptions) =>
  this.getDatabaseAccount(opts)
);
this.clientContext = new ClientContext(options, globalEndpointManager);

this.databases = new Databases(this, this.clientContext);
this.offers = new Offers(this, this.clientContext);

}
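
The proxy branch of the constructor quoted above chooses the tunnel type from the scheme of ProxyUrl. The sketch below isolates only that decision. `pickTunnelKind` is a hypothetical name, it returns a label instead of calling the `tunnel` package, and it uses the WHATWG `URL` class in place of `url.parse`, which is assumed to give the same protocol, hostname, and port for URLs of this shape.

```javascript
// Hypothetical helper: reports which tunnel factory the quoted branch
// would select for a given ProxyUrl, without creating any agent.
function pickTunnelKind(proxyUrlString) {
  const proxyUrl = new URL(proxyUrlString);
  return {
    host: proxyUrl.hostname,
    port: parseInt(proxyUrl.port, 10),
    // Same comparison as the quoted code: the scheme describes how to
    // talk to the proxy itself, not to the Cosmos DB endpoint.
    kind: proxyUrl.protocol.toLowerCase() === "https:" ? "httpsOverHttps" : "httpsOverHttp"
  };
}

const viaHttpsScheme = pickTunnelKind("https://genproxy.amdocs.com:8080");
const viaHttpScheme = pickTunnelKind("http://genproxy.amdocs.com:8080");
console.log(viaHttpsScheme.kind, viaHttpScheme.kind);
```

Under this reading, an https:// ProxyUrl makes the client open a TLS connection to the proxy itself, while an http:// ProxyUrl makes it speak plain HTTP to the proxy and tunnel HTTPS through it. Which of the two a given proxy expects depends on that proxy's configuration.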

Pinif commented 4 years ago

Note the line "if (!this.options.agent)" in your code above. It initializes the related agent settings only when the agent is NOT set in the options I pass, which is incorrect. So when I set agent in the options passed to new CosmosClient, this initialization code is skipped.

You should omit the "!" from this line.
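
For reference, the effect of that guard can be seen in isolation with a minimal sketch. `resolveAgent` is a hypothetical stand-in, not an SDK function, and it leaves out the ProxyUrl branch of the quoted constructor: it only shows what happens to a caller-supplied agent compared with no agent at all.

```javascript
const https = require("https");

// Hypothetical stand-in for the guard in the quoted constructor:
// a default keep-alive agent is built only when none was supplied.
function resolveAgent(options) {
  if (!options.agent) {
    return new https.Agent({ keepAlive: true });
  }
  return options.agent;
}

const custom = new https.Agent({ keepAlive: false });
const kept = resolveAgent({ agent: custom }); // caller-supplied agent
const built = resolveAgent({});               // no agent supplied

console.log(kept === custom);              // true
console.log(built instanceof https.Agent); // true
```

With the guard written this way, a supplied agent is used as-is and the connection policy's ProxyUrl is not consulted; the default or tunnel agent is created only when no agent is passed.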

southpolesteve commented 4 years ago

Since you said above you have the same issue in v3, I want to remain focused on debugging that issue. There are fewer parts involved and any issue there will likely still happen in v2 even if other v2 bugs exist. The v2 code definitely has some complexity which is why we redid it in v3.

Can you confirm again the problem persists in v3?

Looking back through your original code, the only difference I can see vs the v3+proxy-agent approach is disabling SSL verification and enabling keep alive on the agent. Can you try doing that in v3 of the SDK? Code below. If it still fails, can you also post the NODE_DEBUG output? It can't hurt to have that info as well.

New v3+proxy-agent code:

const url = require("url");
const { CosmosClient } = require("@azure/cosmos");
const ProxyAgent = require("proxy-agent");
const options = url.parse("http://genproxy.amdocs.com:8080");
options.rejectUnauthorized = false;
options.keepAlive = true;

const agent = new ProxyAgent(options);

const endpoint = "...";
const key = "...";

const client = new CosmosClient({
  agent,
  endpoint,
  key
});

client.databases
  .readAll()
  .fetchAll()
  .then((response) => {
    console.log(response.resources);
  });

Pinif commented 4 years ago

OK, with v3 I now got a local certificate error, but the following setting resolved it:

process.env["NODE_TLS_REJECT_UNAUTHORIZED"] = 0;

So now from my point of view the issue is closed.

ghost commented 4 years ago

Thanks for working with Microsoft on GitHub! Tell us how you feel about your experience using the reactions on this comment.