lenchv / hive-driver

Driver for connection to Apache Hive via Thrift API
MIT License

connect hive database using node js #66

Open ArunKarthick-pss opened 2 weeks ago

ArunKarthick-pss commented 2 weeks ago

I have tried to connect to a Hive database from Node.js using the hive-driver npm package, but the session cannot be established. This is a major issue with this package. My Node version is 16.5.0 and my npm version is 8.5.5. If there is any other solution, please let me know.

lenchv commented 2 weeks ago

@ArunKarthick-pss could you share an error and the code that reproduces it? Also, did you manage to connect using other clients such as "beeline" from that computer? Are you sure that host is accessible from the computer?

ArunKarthick-pss commented 2 weeks ago

@lenchv, this is my code. I log both the client and session variables. The client variable prints a circular JSON structure, and the session variable does not print anything. Please help me with this.

const hive = require('hive-driver');
const { TCLIService, TCLIService_types } = hive.thrift;
const client = new hive.HiveClient(TCLIService, TCLIService_types);

client
  .connect(
    {
      host: "xxxxx",
      port: "xxx",
      database: "xxxx",
      username: "xxxx",
      password: "xxxxx",
    },
    new hive.connections.TcpConnection(),
    new hive.auth.NoSaslAuthentication()
  )
  .then(async client => {
    console.log('client', client);
    const session = await client.openSession({
      client_protocol: TCLIService_types.TProtocolVersion.HIVE_CLI_SERVICE_PROTOCOL_V10
    });

    console.log('session', session);
    const response = await session.getInfo(
      TCLIService_types.TGetInfoType.CLI_DBMS_VER
    );

    console.log(response.getValue());

    await session.close();
  })
  .catch(error => {
    console.log(error);
  });

lenchv commented 2 weeks ago

@ArunKarthick-pss I see. I get the same behavior with your code if I try to connect to a HiveServer that uses the LDAP authentication strategy. Could you change your code as follows:

const hive = require("hive-driver");
const { TCLIService, TCLIService_types } = hive.thrift;
const client = new hive.HiveClient(TCLIService, TCLIService_types);

client
  .connect(
    {
      host: "xxxxx",
      port: "xxx",
      database: "xxxx",
    },
    new hive.connections.TcpConnection(),
    new hive.auth.PlainTcpAuthentication({
        username: "xxxx",
        password: "xxxxx",
    })
  )
  ...

Note that I use PlainTcpAuthentication instead of NoSaslAuthentication.

ArunKarthick-pss commented 2 weeks ago

@lenchv I tried the code you gave above, but it is not working. It shows the following error:

getaddrinfo ENOTFOUND xxxxxx
    at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:71:26) {
  errno: -3008,
  code: 'ENOTFOUND',
  syscall: 'getaddrinfo',
  hostname: 'xxxxx'
}

lenchv commented 2 weeks ago

@ArunKarthick-pss do you get this error while connecting or while opening the session? Are you sure the hostname you are trying to connect to is accessible? You can check accessibility in several ways:

ping <hostname>
nslookup <hostname>
telnet <hostname> <port>

ArunKarthick-pss commented 1 week ago

@lenchv, the hostname is accessible. However, when I use PlainTcpAuthentication, the connection to the host does not work in my code. Please see the code below.

const hive = require('hive-driver');
const { TCLIService, TCLIService_types } = hive.thrift;
const client = new hive.HiveClient(TCLIService, TCLIService_types);

client
  .connect(
    {
      host: "xxxxxx",
      port: "xxxx",
      database: "xxxxx"
    },
    new hive.connections.TcpConnection(),
    new hive.auth.PlainTcpAuthentication({
      username: "xxxxx",
      password: "xxxxxx",
    })
  )
  .then(async client => {
    console.log('client', client);
    const session = await client.openSession({
      client_protocol: TCLIService_types.TProtocolVersion.HIVE_CLI_SERVICE_PROTOCOL_V10
    });

    console.log('session', session);
    const response = await session.getInfo(
      TCLIService_types.TGetInfoType.CLI_DBMS_VER
    );

    console.log(response.getValue());

    await session.close();
  })
  .catch(error => {
    console.log(error);
  });

In the code above, I log both the client and session variables, but neither of them prints anything.

lenchv commented 1 week ago

@ArunKarthick-pss well, my assumption is that the issue is in lib/connection/transports/TcpTransport.ts:

    connect(): any {
        this.connection = net.createConnection(this.port, this.host);
    };

The difference between NoSaslAuthentication and PlainTcpAuthentication is that NoSaslAuthentication doesn't wait until the connection is established, which is why you don't get an error with it. PlainTcpAuthentication does wait, and apparently the connection breaks because something is wrong with resolving the hostname.

Try to figure out whether you can create a simple TCP connection:

const net = require('net');

const conn = net.createConnection(<port>, <host>);
conn.addListener('connect', () => console.log('connected'));
conn.addListener('error', (error) => console.error(error));

See net.createConnection options, perhaps you need some specific ones.
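
To make a failed or hanging connection visible instead of silent, the same check can be wrapped in a promise with a timeout. This is only a sketch: connectAndWait is a hypothetical helper name, not part of hive-driver, and the 5000 ms default is an arbitrary assumption.

```javascript
const net = require('net');

// Hypothetical helper (not part of hive-driver): resolves with the socket once
// the TCP connection is established, rejects on a connection error or timeout.
function connectAndWait(port, host, timeoutMs = 5000) {
  return new Promise((resolve, reject) => {
    const socket = net.createConnection(port, host);

    // Destroying the socket with an error makes it emit 'error',
    // which is handled by onError below.
    const timer = setTimeout(() => {
      socket.destroy(
        new Error(`Connection to ${host}:${port} timed out after ${timeoutMs} ms`)
      );
    }, timeoutMs);

    const onError = (err) => {
      clearTimeout(timer);
      reject(err);
    };

    socket.once('error', onError);
    socket.once('connect', () => {
      clearTimeout(timer);
      // From here on the caller should attach its own 'error' handler.
      socket.removeListener('error', onError);
      resolve(socket);
    });
  });
}
```

Calling connectAndWait(<port>, <host>) and logging the rejection should show the same ENOTFOUND error if hostname resolution is the problem.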

Also, you can make sure that Node.js resolves your hostname properly:

const dns = require('dns');

dns.resolve(<hostname>, (err, records) => {
    err ? console.error(err) : console.log(records)
})

And if "nslookup <hostname>" works properly on your system, you can try to pass the IP address directly instead of the hostname.

If you manage to connect with some specific options, I can update the library, or you can send a pull request.
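
One detail worth keeping in mind when checking resolution: the error in this thread has syscall 'getaddrinfo', which is the path used by dns.lookup, and net.createConnection uses dns.lookup by default. dns.resolve queries DNS servers directly and does not go through the operating system resolver, so it can succeed while dns.lookup fails (or the other way round). A minimal sketch of the dns.lookup check follows; lookupHost is a hypothetical helper name, and reading the host from a HIVE_HOST environment variable is an assumption made for illustration.

```javascript
const dns = require('dns');

// Assumption for illustration: the host is taken from HIVE_HOST,
// falling back to 'localhost' so the snippet runs as-is.
const HOSTNAME = process.env.HIVE_HOST || 'localhost';

// Hypothetical helper: resolves the hostname through the OS resolver
// (getaddrinfo), the same path net.createConnection uses by default.
function lookupHost(hostname) {
  return new Promise((resolve, reject) => {
    dns.lookup(hostname, { all: true }, (err, addresses) => {
      if (err) {
        reject(err);
      } else {
        resolve(addresses);
      }
    });
  });
}

lookupHost(HOSTNAME)
  .then((addresses) => console.log('dns.lookup result:', addresses))
  .catch((err) => console.error('dns.lookup failed:', err.code, err.message));
```

If this fails with ENOTFOUND while dns.resolve succeeds, the difference is likely in the local resolver configuration rather than in hive-driver.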

ArunKarthick-pss commented 1 week ago

@lenchv I have tried both the net and dns checks with my hostname, and both work fine. But the session still cannot be established. Please help me with this.

ArunKarthick-pss commented 1 week ago

Hi @lenchv, please help with the issue I mentioned above.

lenchv commented 1 week ago

@ArunKarthick-pss Unfortunately I cannot reproduce such behavior. I need your help to figure out what is wrong with the connection and find out how to fix it.

Try to debug step-by-step and find exactly on what lines the connection fails.

Also, try to narrow down the issue: if you are able to connect using the simple snippet I shared above, then try to authenticate manually. The code below initiates TCP Plain authentication:

const net = require('net');

const HOST = '<your host>'
const PORT = '<your port>'
const USERNAME = '<your username>'
const PASSWORD = '<your password>'

const SASL_CODES = {
    START: 1,
    OK: 2,
    BAD: 3,
    ERROR: 4,
    COMPLETE: 5,
}
function createSaslPackage(status, body) {
    const bodyLength = Buffer.alloc(4);

    bodyLength.writeUInt32BE(body.length, 0);

    return Buffer.concat([ Buffer.from([ status ]), bodyLength, body ]);
}

const conn = net.createConnection(PORT, HOST);
conn.addListener('connect', () => {
    console.log('connected');
    conn.write(createSaslPackage(SASL_CODES.START, Buffer.from('PLAIN')))
    conn.write(createSaslPackage(SASL_CODES.OK, Buffer.concat([
        Buffer.from(USERNAME || ""),
        Buffer.from([0]),
        Buffer.from(USERNAME || ""),
        Buffer.from([0]),
        Buffer.from(PASSWORD || ""),
    ])))
});
conn.addListener('error', (error) => console.error(error));
conn.addListener('data', (data) => {
    if (data[0] == SASL_CODES.COMPLETE) {
        console.log('authenticated successfully')
    } else {
        console.log("Authentication failed: " + data.toString());
    }
})

Did you also try using a resolved IP address instead of hostname?

Also, what do the HiveServer logs show? Do they output anything?
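
The 'data' handler in the snippet above only looks at the first byte of the reply. To see the status and the message separately, the reply can be decoded with the inverse of createSaslPackage. This is a sketch that assumes the server answers with the same framing (1 status byte, 4-byte big-endian body length, body); parseSaslPackage is a hypothetical helper name.

```javascript
const SASL_CODES = {
  START: 1,
  OK: 2,
  BAD: 3,
  ERROR: 4,
  COMPLETE: 5,
};

// Hypothetical helper: decodes one frame laid out as
// [status: 1 byte][body length: 4 bytes, big-endian][body].
// Returns null if the buffer does not yet contain a complete frame,
// which can happen because TCP may deliver a frame in several chunks.
function parseSaslPackage(data) {
  const HEADER_LENGTH = 5;

  if (data.length < HEADER_LENGTH) {
    return null;
  }

  const status = data[0];
  const bodyLength = data.readUInt32BE(1);

  if (data.length < HEADER_LENGTH + bodyLength) {
    return null;
  }

  const body = data.subarray(HEADER_LENGTH, HEADER_LENGTH + bodyLength);

  return { status, body };
}
```

Inside the 'data' listener, parseSaslPackage(data) can then be used to log the status code and body.toString() instead of the raw buffer.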

lenchv commented 1 week ago

I can easily get the same error if I try to connect to a non-existent hostname, which definitely cannot be resolved:

Error: getaddrinfo ENOTFOUND non-existing-host.local
    at GetAddrInfoReqWrap.onlookupall [as oncomplete] (node:dns:118:26) {
  errno: -3008,
  code: 'ENOTFOUND',
  syscall: 'getaddrinfo',
  hostname: 'non-existing-host.local'
}

Are you sure the hostname you typed is correct?
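
Some typical mistakes in a host value can be checked mechanically before connecting. The sketch below is only an illustration of such a check: findHostProblems is a hypothetical helper, not part of hive-driver, and the list of checks is an assumption about common causes of ENOTFOUND, not a diagnosis of this issue.

```javascript
// Hypothetical helper: returns a list of human-readable problems found in a
// host string that is meant to be passed as the "host" connection option.
function findHostProblems(host) {
  const problems = [];

  if (typeof host !== 'string' || host.length === 0) {
    problems.push('host is empty or not a string');
    return problems;
  }

  const trimmed = host.trim();
  if (trimmed !== host) {
    problems.push('host has leading or trailing whitespace');
  }

  // A scheme such as "http://" or "jdbc:hive2://" is not part of a hostname.
  const schemePattern = /^[a-z][a-z0-9+.:-]*:\/\//i;
  const hasScheme = schemePattern.test(trimmed);
  if (hasScheme) {
    problems.push('host contains a URL scheme');
  }

  const rest = hasScheme ? trimmed.replace(schemePattern, '') : trimmed;

  if (/[\/?#]/.test(rest)) {
    problems.push('host contains a path, query, or fragment');
  }

  // Exactly one colon followed by digits looks like "host:port".
  // Values with several colons are left alone because they may be IPv6.
  const colonCount = (rest.match(/:/g) || []).length;
  if (colonCount === 1 && /:\d+$/.test(rest)) {
    problems.push('host contains a port; pass it in the separate "port" option');
  }

  return problems;
}

console.log(findHostProblems('http://hive.example.com:10000'));
```

An empty array means none of these particular mistakes were found; it does not prove that the hostname resolves.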

ArunKarthick-pss commented 2 days ago

@lenchv I have written two functions: the first connects to the Hive database and the second establishes a session. The connection works fine, but the session function does not return anything. Please see the code below, and correct me if I am doing anything wrong.

const thrift = require('thrift');
const hive = require('hive-driver');
const { TCLIService, TCLIService_types } = hive.thrift;

// Create a Thrift client
const createHiveConnection = async () => {
  const transport = thrift.TBufferedTransport;
  const protocol = thrift.TBinaryProtocol;

  const connection = thrift.createConnection('xxxxxxx', 10000, {
    transport,
    protocol
  });

  const client = thrift.createClient(TCLIService, connection);

  connection.on('error', (err) => {
    console.log('Connection Error');
  });

  return new Promise((resolve, reject) => {
    connection.on('connect', () => {
      console.log('Connected to Hive');
      resolve({ client, connection });
    });

    connection.on('error', (err) => {
      console.log('Connection Error');
      reject(err);
    });
  });
};

createHiveConnection();

const executeQuery = async () => {
  try {
    const { client, connection } = await createHiveConnection();

    const sessionRequest = {
      username: 'xxxxxxx',
      password: "xxxxxxxxx",
      client_protocol: TCLIService_types.TProtocolVersion.HIVE_CLI_SERVICE_PROTOCOL_V10,
      database: "xxxxxxxxx"
    };

    // console.log('client', client)
    const session = await client.OpenSession(sessionRequest);

    console.log('session', session);
  } catch (error) {
    console.log('error', error);
  }
};

executeQuery();

ArunKarthick-pss commented 2 days ago

@lenchv, my hostname is definitely correct.