microsoft / botframework-sdk

Bot Framework provides the most comprehensive experience for building conversation applications.
MIT License

[Question] How to store attachments uploaded to bot? #3781

Closed nickgkan closed 6 years ago

nickgkan commented 7 years ago

We have configured a bot using Node.js. When we upload a file to the bot, a session.message object is created, containing a field "attachments". When we try to perform a GET request in order to store these attachments, a corrupted file is stored instead. The same code used to work about 3 weeks ago.

We use:

In more detail, we upload a file and try to store it by performing a GET request (with a JWT token if necessary). The response to this request should be the file's body with status 200. Until recently, however, the response was 500 and the error body was empty. This week we observed something different: the response status is 200, but the stored file (usually a PDF) is corrupted and appears blank. Its size is slightly smaller than the original, and when we compare the original and the stored file (using Notepad++), they seem to differ only in non-ASCII characters. That is apparently enough to erase everything inside the file and leave it blank.
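The symptom described above (a slightly smaller file that differs only in non-ASCII bytes) is consistent with binary data being decoded as a UTF-8 string and then written back byte by byte. The `request` library returns the body as a UTF-8 decoded string unless `encoding: null` is set. The following is a minimal sketch of that effect using only Node's `Buffer`; the byte values are made up for illustration and are not taken from a real attachment.

```javascript
// Made-up stand-in for the start of a PDF: the ASCII header "%PDF-"
// followed by a few high (non-ASCII) bytes and a newline.
const original = Buffer.from([0x25, 0x50, 0x44, 0x46, 0x2d, 0xc3, 0xa9, 0xff, 0x0a]);

// Decoding binary data as UTF-8 merges valid multi-byte sequences into one
// character and replaces invalid bytes with U+FFFD.
const asString = original.toString('utf8');

// Writing that string back with the 'binary' (latin1) encoding keeps only
// the low byte of each character, so the original bytes cannot be recovered.
const roundTripped = Buffer.from(asString, 'binary');

console.log(original.length, roundTripped.length);   // 9 8
console.log(original.equals(roundTripped));           // false
// The ASCII part survives untouched, only the high bytes are damaged.
console.log(original.subarray(0, 5).equals(roundTripped.subarray(0, 5))); // true
```

If this is what is happening, keeping the body as a `Buffer` from download to `fs.writeFile` avoids the lossy string conversion.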

Reproduction Steps

  1. Upload a file (let's say a pdf file) in any of these channels: Skype, MS Teams, E-Mail, Web Chat.
  2. Perform the GET request to store it.
  3. Check the stored file (if any).

Expected Behavior

Get a stored file, identical to the one we uploaded.

Actual Results

Either an error 500 with no error message (until some days ago) or a response with status 200 but a corrupted file stored (since 13/11/2017).

nwhitmont commented 7 years ago

@nickgkan Check out this example. It shows how to receive attachments via bot messages.

https://github.com/nwhitmont/botframework-node-v3-receive-attachment

nickgkan commented 7 years ago

Dear @nwhitmont

Thank you for trying to help. I had already checked this example; it is what I was referring to in my original post: "In case of Skype and MS Teams, we use JwtToken as Authorization header, as recommended." So this is not a question about how to store attachments but an issue, related to https://github.com/Microsoft/BotBuilder/issues/3623#issuecomment-342160276 (the issue in general, not just my comment). I opened a new issue to point out that I encounter the same 500 server error, and in some cases (I tried more than one bot) the empty file, using Node.js, not only on Skype but also on every other channel I mentioned in my original post: "MS Teams, E-Mail, Web Chat". As you can see in the other issue (3623), I am not the only one who encounters this. And as I said in my original post, "The same code used to work about 3 weeks ago." I hope you come back with a solution soon!

mbertrait commented 7 years ago

@nickgkan Is your issue solved? I experienced almost the same issue weeks ago, but it was because I hadn't differentiated between the Skype case (JwtToken) and the Facebook one (no token required). In my case, the user uploads an image and I need to store it on an FTP server (first I store the image in a tmp folder, then upload it to the FTP server and delete it from the tmp folder). My images are not corrupted. Make sure to give the downloaded file the correct extension (.jpeg in my case).
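The per-channel distinction described above can be sketched as a small function that builds the options for a `request` GET. This is only an illustration: `buildAttachmentRequestOptions` is a hypothetical name, the list of channels that need a token (`skype`, `msteams`) is an assumption taken from the code posted in this thread, and the URLs and token in the usage lines are placeholders.

```javascript
// Hypothetical helper: builds the options object for a `request` GET
// that downloads one attachment.
function buildAttachmentRequestOptions(attachment, channelId, token) {
    const options = {
        url: attachment.contentUrl,
        method: 'GET',
        headers: {},
        // With `encoding: null`, `request` returns the body as a Buffer
        // instead of a UTF-8 decoded string.
        encoding: null
    };
    // Assumption from this thread: only these channels require the
    // connector's JWT token as a Bearer Authorization header.
    const channelsNeedingToken = ['skype', 'msteams'];
    if (channelsNeedingToken.includes(channelId)) {
        options.headers.Authorization = 'Bearer ' + token;
    }
    return options;
}

// Usage with placeholder values:
const skypeOptions = buildAttachmentRequestOptions(
    { contentUrl: 'https://example.invalid/skype-file' }, 'skype', 'abc');
const facebookOptions = buildAttachmentRequestOptions(
    { contentUrl: 'https://example.invalid/facebook-file' }, 'facebook', 'abc');
console.log(skypeOptions.headers, facebookOptions.headers);
```

Keeping the header logic in one place makes it easier to see which channels send the token and which do not.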

nwhitmont commented 7 years ago

@nickgkan Can you post the code "that worked 3 weeks ago" for reference?

nickgkan commented 7 years ago

@nwhitmont Update: Same bot, two different users. When the first of them posts a file and the bot tries to store it, I get a 200 response from the server but the file is corrupted. When the second tries the same, I get a 500 response!

Take a look at the code:

module.exports = function (bot) {
    var log4js = require('log4js');
    var logger = log4js.getLogger("attachFile.js");
    var fs = require('fs');
    var request = require('request');
    var mkdirp = require('mkdirp');
    var builder = require('botbuilder');

    bot.dialog('/attachFile', [
        function(session,args,next){
            logger.debug("/attachFile/F1: Start");
            if (session.privateConversationData.JWToken===undefined){
                logger.error("/attachFile/F1: Undefined JWToken");
                session.endConversation("Unable to store your attachments. Sorry for the inconvenience, please try again.");
            }
            else {
                if (session.privateConversationData.userRequest.text.length===0) {
                    if (session.privateConversationData.userRequest.attachments.length===1){
                        var txt="I received your attachment. Please let me know how should I handle it.";
                    }
                    else{
                        var txt="I received your attachments. Please let me know how should I handle them.";
                    }
                    var msg = new builder.Message(session).textFormat('markdown').text(txt);
                    builder.Prompts.text(session,msg);
                }
                else {
                    next();
                }
            }
        },

        function(session, args, next) {
            logger.debug("/attachFile/F2: Start");
            if (!(args.response ===null) && !(args.response ===undefined)){
                session.privateConversationData.userRequest.text = args.response;
            }

            var mkdirName = process.env.BOT_FILES_INCOMING_PATH + '/' + session.privateConversationData.userAccount.userAccountId;
            mkdirp(mkdirName, function(err) {
                if (err) {
                    logger.error("/attachFile/F2: unable to create folder. Error->  "+err);
                    session.endConversation("Unable to store your attachments. Sorry for the inconvenience, please try again.");
                }
                else {
                    if (!mkdirName.endsWith('/')){
                        mkdirName=mkdirName+'/';
                    }
                    session.privateConversationData.attachmentsToWrite=session.privateConversationData.userRequest.attachments.length-1;
                    writeFileRequest(session,mkdirName);
                }
            });     
        }
    ]);

    function writeFileRequest(session,mkdirName) {

        var options = {
            url: session.privateConversationData.userRequest.attachments[session.privateConversationData.attachmentsToWrite].contentUrl,
            method: 'GET',
            headers: {'Content-type' : session.privateConversationData.userRequest.attachments[session.privateConversationData.attachmentsToWrite].contentType}
        };
        if (session.message.address.channelId === 'skype' || session.message.address.channelId === 'msteams'){
            options.headers.Authorization='Bearer ' + session.privateConversationData.JWToken;
        }

        request(options, function(err, response, body) {
            if (err){
                logger.error(err);
            }
            else{
                console.log(response.statusCode);

                var fileName = session.privateConversationData.userRequest.attachments[session.privateConversationData.attachmentsToWrite].name;
                if (fs.existsSync(mkdirName + fileName)) {
                    var fileType=fileName.substr(fileName.lastIndexOf('.')); //e.g. '.pdf'
                    var fileSubName=fileName.substr(0, fileName.length-fileType.length); //'name' if original fileName is 'name.pdf'
                    var j=1;
                    while(fs.existsSync(mkdirName + fileSubName+"("+j+")"+fileType)) {
                        j+=1;
                    }
                    fileName=fileSubName+"("+j+")"+fileType;
                }
                session.privateConversationData.userRequest.attachments[session.privateConversationData.attachmentsToWrite]={name: fileName, contentUrl: mkdirName, contentType: session.privateConversationData.userRequest.attachments[session.privateConversationData.attachmentsToWrite].contentType};
                fs.writeFile(mkdirName + fileName, body, {encoding: 'binary'}, function(err) {//{encoding: 'binary' , flag: 'wx'} 
                    if(err) {
                        logger.error("/attachFile/F2: unable to save file. Error->  "+err);
                        session.endConversation("Unable to store your attachments. Sorry for the inconvenience, please try again.");
                    }
                    else {
                        session.privateConversationData.attachmentsToWrite-=1;
                        if (session.privateConversationData.attachmentsToWrite<0) {
                            session.beginDialog("/textRequest");
                        }
                        else {
                            writeFileRequest(session,mkdirName);
                        }
                    }
                });
            }
        });
    }
}

Note that session.privateConversationData.userRequest was set earlier, outside this dialog, as session.privateConversationData.userRequest=session.message;
and the code after that is simply:
if (session.privateConversationData.userRequest.attachments.length) {
                session.beginDialog("/attachFile");
            }
            else if (session.privateConversationData.userRequest.text.length) {
                session.beginDialog("/textRequest");
            }
            else {
                session.endConversation('');
            }
so this object is just session.message without any modification.
As for the Jwt Token, this is set inside the root dialog:

connector.getAccessToken(function(err,token){
                if (!err){
                    session.privateConversationData.JWToken=token;
                }
                else {
                    logger.error(">Initializing< Get access token for attachment download Error-> "+err);
                }
            });

Hope this helps!

nickgkan commented 7 years ago

Any news? Alternatively, is there any proposal for a different implementation? I wonder if there is any way to skip the GET request to the external link, basically, replace this request with a different approach.

mbertrait commented 7 years ago

Well,

I'm using GET requests with both Skype and Facebook to download images from users. I'm sharing my code in case it helps you. Put it in a js file (downloader.js, for example) and import it in the js file where you need it. The arguments are:

Personally, I call the function as follows:

    function (session, result) {
        var file_name = session.userData.id+'_'+session.userData.name+'.jpeg';
        mkdirp('tmp', function(err) {
            download(arg.connector, session.message, file_name, true);
        });
    }

Save the following code in a separate js file and import (require) it in another js file:

"use strict";
var fs = require('fs');
var request = require('request');
var progress = require('request-progress');
var async = require('async');
var url = require('url');
var ftpSender = require('../helpers/ftp-client');

function downloadAttachments(connector, message, callback) {
        var attachments = [];
        var containsSkypeUrl = false; // Whether any attachment was sent through Skype
        message.attachments.forEach(function (attachment) {
            if (attachment.contentUrl) {
                attachments.push({
                    contentType: attachment.contentType,
                    contentUrl: attachment.contentUrl
                });
                if (url.parse(attachment.contentUrl).hostname.substr(-"smba.trafficmanager.net".length) == "smba.trafficmanager.net") {
                    containsSkypeUrl = true;
                }
            }
        });
        if (attachments.length > 0) {
            async.waterfall([
                function (cb) {
                    if (containsSkypeUrl) {
                        connector.getAccessToken(cb);
                    }
                    else {
                        cb(null, null);
                    }
                }
            ], function (err, token) {
                    var buffers = "";
                    async.forEachOf(attachments, function (item, idx, cb) {
                        var contentUrl = item.contentUrl;
                        var headers = {};

                        if (url.parse(contentUrl).hostname.substr(-"smba.trafficmanager.net".length) == "smba.trafficmanager.net") {
                            headers['Authorization'] = 'Bearer ' + token;
                            headers['Content-Type'] = 'application/octet-stream';
                        }
                        else {
                            headers['Content-Type'] = item.contentType;
                        }
                        request({
                            url: contentUrl,
                            headers: headers,
                            encoding: null
                        }, function (err, res, body) {
                            if (!err && res.statusCode == 200) {
                                buffers = body;
                            }
                            cb(err);
                        });
                    }, function (err) {
                        if (callback)
                            callback(err, buffers);
                    });
            });
        }
        else {
            if (callback)
                callback(null, null);
        }
    };

/**
 * Downloads the image the user sent through the bot and saves it to the tmp folder.
 * The file name has the form userID_userName.jpeg.
 */
module.exports = function (connector, message, file_name, send_ftp) {
        downloadAttachments(connector, message, function (err, buffers) {
                // Skip the write if the download failed or nothing was downloaded
                // (buffers stays empty when the response status is not 200).
                if (err || !buffers)
                    return console.log(err || "No attachment was downloaded.");
                fs.writeFile("tmp/"+file_name, buffers, function(err) {
                            if(err)
                                return console.log(err);
                            console.log("The file was saved!");
                            if (send_ftp)
                                ftpSender(file_name);
                });
        });
}

Hope it will help you.

nickgkan commented 7 years ago

Dear @MBbrainsonic thank you for the response! Sorry for my late response, I've been extensively testing this part of the code and even tried yours. It seems like the corrupted files' issue is solved, as it was an encoding problem, so your code really helped. I had to try it with a colleague's account though, because personally, I keep getting the 500 error from the server. Speaking in terms of your own code, my problem is at this part of the code:

request({
                            url: contentUrl,
                            headers: headers,
                            encoding: null
                        }, function (err, res, body) {
                            if (!err && res.statusCode == 200) {
                                buffers = body;
                            }
                            cb(err);
                        });

I receive a status code of 500 and the error body (err in your code) is null, same as in #3623 .

So I experimented a lot with this part. What I found interesting is that 6 out of 8 users (including myself) receive an error 500, while the other two receive a status 200. I was able to reproduce it with Postman too. Could there be any relation with the users' state? The different result per user is very strange. I tried to clean up users' state with no gain. Hope the issue is more specific now and someone could help!
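In the snippet quoted above, a 500 response arrives with `err` set to null, so `cb(err)` reports success and the caller is left with an empty buffer. One way to make the failure explicit is to turn any non-200 status into an error. The helper below is a minimal sketch under that assumption; `downloadError` is a hypothetical name and is not part of `request` or the Bot Framework SDK.

```javascript
// Hypothetical helper: returns an Error when the download did not succeed,
// or null when the response has status 200.
function downloadError(err, res) {
    if (err) {
        return err; // transport-level failure reported by the HTTP client
    }
    if (!res || res.statusCode !== 200) {
        var status = res ? res.statusCode : 'unknown';
        return new Error('Attachment download failed with status ' + status);
    }
    return null;
}

// Simulated callback arguments for a 500 and a 200 response:
var failure = downloadError(null, { statusCode: 500 });
var success = downloadError(null, { statusCode: 200 });
console.log(failure.message); // Attachment download failed with status 500
console.log(success);         // null
```

Inside the request callback this would be used as `cb(downloadError(err, res))`, so that the 500 (or the later 403) surfaces in the final callback instead of being silently dropped.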

nickgkan commented 6 years ago

Update: For more than two weeks now I have no longer seen the error with status code 500, but a new one with status code 403 instead. Is there anything we have to change to make it work now that the internal server error seems to have been fixed?

nickgkan commented 6 years ago

It seems that the issue is now resolved. I will close this GitHub thread.

evgeniy-glushin commented 5 years ago

The issue is still around: when I send a video message, I get a 500. On the other hand, it works fine when I send a video file as a file attachment. Please let me know if you have any ideas on how to fix this.