watson-developer-cloud / node-sdk

Node.js library to access IBM Watson services.
https://www.npmjs.com/package/ibm-watson
Apache License 2.0

[discovery] add support for uploading an in memory json structure #397

Closed robertyates closed 6 years ago

robertyates commented 7 years ago

At the moment it is not possible to upload an in-memory JSON buffer via the API.

This code

var program = require('commander');
var DiscoveryV1 = require('watson-developer-cloud/discovery/v1');
var fs = require('fs');
var readline = require('readline');
var stream = require('stream');

program
  .arguments('<file>')
  .option('-u, --username <username>', 'The username needed for the Discovery Service')
  .option('-p, --password <password>', 'The password needed for the Discovery Service')
  .option('-e, --environment_id <environment_id>', 'The environment id for the Discovery Service')
  .option('-c, --collection_id <collection_id>', 'The collection id for the Discovery Service')
  .action(function(file) {
    console.log('user: %s pass: %s file: %s', program.username, program.password, file);

    var discovery = new DiscoveryV1({
      username: program.username,
      password: program.password,
      version_date: DiscoveryV1.VERSION_DATE_2016_12_15
    });

    var instream = fs.createReadStream(file);
    var outstream = new stream;

    readline.createInterface(instream,outstream)
      .on('line', function(line) {
        console.log('calling discovery with %s',line.toString());
        discovery.addDocument({
          environment_id:  program.environment_id,
          collection_id: program.collection_id,
          file: line
          },
          function(err,response){
            if (err) {
              console.log(err);
            } else {
              console.log(JSON.stringify(response, null, 2));
            }
          }
        );
      })
  })
  .parse(process.argv); // commander only runs the action once the arguments are parsed

produces the following error

{ Error: The Media Type [application/octet-stream] of the input document is not supported. Auto correction was attempted, but the auto detected media type [text/plain] is also not supported. Supported Media Types are: application/json, application/msword, application/vnd.openxmlformats-officedocument.wordprocessingml.document, application/pdf, text/html, application/xhtml+xml .
    at Request._callback (/Users/robyates/elite/postfiles/node_modules/watson-developer-cloud/lib/requestwrapper.js:77:15)
    at Request.self.callback (/Users/robyates/elite/postfiles/node_modules/request/request.js:186:22)
    at emitTwo (events.js:106:13)
    at Request.emit (events.js:191:7)
    at Request.<anonymous> (/Users/robyates/elite/postfiles/node_modules/request/request.js:1081:10)
    at emitOne (events.js:96:13)
    at Request.emit (events.js:188:7)
    at Gunzip.<anonymous> (/Users/robyates/elite/postfiles/node_modules/request/request.js:1001:12)
    at Gunzip.g (events.js:291:16)
    at emitNone (events.js:91:20)
    at Gunzip.emit (events.js:185:7)
  code: 415,

It may be possible to pass the content type in the metadata field, but if so I could not work out the correct incantation, and it is not documented.

The file being used is in JSON Lines format: http://jsonlines.org/
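
For readers unfamiliar with the format: a JSON Lines file holds one complete JSON value per line, which is why the script above hands each line to the service as its own document. A minimal sketch of splitting such text into records follows; the helper name `splitJsonLines` and the sample data are made up for illustration.

```javascript
// Hypothetical helper: splits JSON Lines text into one string per record,
// skipping blank lines (for example, a trailing newline at the end of the file).
function splitJsonLines(text) {
  return text
    .split(/\r?\n/)
    .filter(function (line) { return line.trim() !== ''; });
}

// Made-up sample input: two records, one per line.
var sample = '{"id": 1, "text": "first"}\n{"id": 2, "text": "second"}\n';

// Each line parses on its own as a standalone JSON document.
var records = splitJsonLines(sample).map(function (line) {
  return JSON.parse(line);
});
```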

germanattanasio commented 7 years ago

@nfriedly can you take a look at this?

adrien2p commented 7 years ago

That could be because the API expects one of the following content types:

application/json
application/msword
application/vnd.openxmlformats-officedocument.wordprocessingml.document
application/pdf
text/html
application/xhtml+xml

Maybe a readable stream could be accepted, but it would first have to be converted into one of the content types the API accepts.

nfriedly commented 7 years ago

@robertyates

Try changing file: line to

  file: {
    value: line,
    options: {
      filename: 'line.json',
      contentType: 'application/json'
    }
  }

It's a bit of a hack, but that's the format that request uses to define the contentType and such of a file, and we use request under the hood, so I think it will work. (We might switch to another library eventually, but I'll keep this syntax and/or provide something better. And it'd be a major version change either way.)
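
A minimal sketch of how that suggestion might slot into the original loop, untested against the live service. The helper names `toJsonFilePart` and `addLine`, the per-line filename, and the up-front `JSON.parse` check are assumptions added for illustration; only the `{ value, options }` shape and the `addDocument` parameters come from this thread.

```javascript
// Hypothetical helper: wraps one JSON Lines record in the { value, options }
// shape suggested above, so the upload is labelled application/json.
function toJsonFilePart(line, index) {
  // Assumption: reject malformed records locally; JSON.parse throws on bad input.
  JSON.parse(line);
  return {
    value: line,
    options: {
      filename: 'line-' + index + '.json', // illustrative name only
      contentType: 'application/json'
    }
  };
}

// Hypothetical wrapper around the addDocument call from the original script.
// `discovery` is the DiscoveryV1 client; `ids` carries the two ids from the CLI options.
function addLine(discovery, ids, line, index, callback) {
  discovery.addDocument({
    environment_id: ids.environment_id,
    collection_id: ids.collection_id,
    file: toJsonFilePart(line, index)
  }, callback);
}
```

Inside the 'line' handler this would be called as `addLine(discovery, program, line, lineNumber, callback)`, where `lineNumber` is a counter the script would need to keep itself.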