tleyden / open-ocr

Run your own OCR-as-a-Service using Tesseract and Docker
Apache License 2.0

HTTP request for passing file upload #69

Closed GeorgeAnanthSoosai closed 8 years ago

GeorgeAnanthSoosai commented 8 years ago

@tleyden -- Can you please show me how to pass the payload for an image attachment?

I've tried with an image URL string and it worked well. I'm not sure how to pass an attachment to the API.

Please give me an example.

GeorgeAnanthSoosai commented 8 years ago

I'm getting the following error:

Error extracting multipart/related parts: Didn't expect to get this far

Here is the header I used in my JavaScript code:

Content-Type:multipart/related; boundary=-----sBOUNDARY

Payload (body):

-----BOUNDARY Content-Type: application/json
{'engine':'tesseract'} -----BOUNDARY
-----BOUNDARY Content-Disposition: attachment; Content-Type: image/* filename='attachment.txt'. data:image/jpeg;base64,/9j/4AAQSkZJRgABAgAAZABkAAD/7AARRHVja3kAAQAEAAAAUAAA/+4ADkFkb2JlAGTAAAAAAf/bAIQAAgICAgICAgICAgMCAgIDBAMCAgMEBQQEBAQEBQYFBQUFBQUGBgcHCAcHBgkJCgoJCQwMDAwMDAwMDAwMDAwMDAEDAwMFBAUJBgYJDQsJCw0PDg4ODg8PDAwMDAwPDwwMDAwMDA8MDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwM/8AAEQgNrBOIAwERAAIRAQMRAf/EAPgAAQEBAAEFAQEAAAAAAAAAAAABAgMFAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAf/2Q== -----BOUNDARY

tleyden commented 8 years ago

See notes on https://github.com/tleyden/open-ocr/issues/67

GeorgeAnanthSoosai commented 8 years ago

@tleyden Should I pass the PNG in base64 format, as I do here?

data:image/jpeg;base64,/9j/4AAQSkZJRgABAgAAZABkAAD/7AARRHVja3kAAQAEAAAAUAAA/+4ADkFkb2JlAGTAAAAAAf/bAIQAAgICAgICAgICAgMCAgIDBAMCAgMEBQQEBAQEBQYFBQUFBQUGBgcHCAcHBgkJCgoJCQwMDAwMDAwMDAwMDAwMDAEDAwMFBAUJBgYJDQsJCw0PDg4ODg8PDAwMDAwPDwwMDAwMDA8MDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwM/8AAEQgNrBOIAwERAAIRAQMRAf/EAPgAAQEBAAEFAQEAAAAAAAAAAAABAgMFAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAf/2Q==

GeorgeAnanthSoosai commented 8 years ago

Please share a real example of passing an image.

GeorgeAnanthSoosai commented 8 years ago

@tleyden -- Should I pass the image data as base64?

tleyden commented 8 years ago

Take a look at https://github.com/tleyden/open-ocr/issues/67#issuecomment-248729846

GeorgeAnanthSoosai commented 8 years ago

I'm not sure what we should pass in the body of the payload.

var request = new XMLHttpRequest();

request.open('POST', 'http://openocr.yourserver.com/ocr-file-upload');

request.setRequestHeader('Content-Type', 'multipart/related; boundary=---BOUNDARY');

request.onreadystatechange = function () {
  if (this.readyState === 4) {
    console.log('Status:', this.status);
    console.log('Headers:', this.getAllResponseHeaders());
    console.log('Body:', this.responseText);
  }
};

var body = "-----BOUNDARY \
Content-Type: application/json \
\
{'engine':'tesseract'} \
-----BOUNDARY \
\
-----BOUNDARY \
Content-Disposition: attachment; \
Content-Type: image/png \
filename='attachment.txt'. \
\
PNGDATA......... \
-----BOUNDARY";

request.send(body);

What should I pass for PNGDATA.........? Is it base64 image data?
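A side note on the `body` string in the snippet above: its backslashes look like JavaScript line continuations. A backslash followed by a newline inside a string literal removes the newline from the resulting string, so a body written that way reaches the server as a single line with no header/body separation. The minimal sketch below only demonstrates that language behavior. It does not show what the open-ocr server expects, and the variable names are illustrative.

```javascript
// A backslash at the end of a line inside a string literal is a line
// continuation: the backslash and the newline are both dropped.
const continued = "-----BOUNDARY \
Content-Type: application/json";

// Writing the CRLF explicitly keeps each header on its own line.
const explicit = "-----BOUNDARY\r\n" + "Content-Type: application/json";

console.log(continued.includes("\n"));      // false
console.log(explicit.split("\r\n").length); // 2
```

Multipart framing is line-based, so missing line breaks alone may be enough to make a multipart parser reject the request.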

GeorgeAnanthSoosai commented 8 years ago

@tleyden -- Can you please give me an example? I couldn't upload from a file, which is what I need to do now. Please help me with this.

My questions are:

1. How do I construct the body payload?

2. How do I construct the image in the body section?

I'm not sure whether this supports being invoked from JavaScript.

I'm getting several issues. Here is the new one I get when I provide the upload payload from JavaScript:

Error: socket hang up
    at createHangUpError (_http_client.js:252:15)
    at Socket.socketOnEnd (_http_client.js:344:23)
    at emitNone (events.js:91:20)
    at Socket.emit (events.js:185:7)
    at endReadableNT (_stream_readable.js:974:12)
    at _combinedTickCallback (internal/process/next_tick.js:74:11)
    at process._tickCallback (internal/process/next_tick.js:98:9)

Please give me an example for the JavaScript side; that would be very helpful. Thanks in advance.

GeorgeAnanthSoosai commented 8 years ago

Please give me one working example with file upload; it would be very helpful.

GeorgeAnanthSoosai commented 8 years ago

I tried https://github.com/tleyden/open-ocr-client and it works great! I need to do the same thing in JavaScript. Can you please give me a working sample in JavaScript?

GeorgeAnanthSoosai commented 8 years ago

Does this read base64 image data? Can you please give me your input on this?
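The thread does not confirm whether the upload endpoint accepts base64 text in the image part. If it expects the raw image bytes, which is an assumption here, the `data:image/jpeg;base64,...` string would first have to be decoded. A minimal sketch of that decoding step follows; `dataUrlToBytes` is a hypothetical helper name, not part of open-ocr.

```javascript
// Hypothetical helper: strips an optional "data:<mime>;base64," prefix
// and decodes the remaining base64 text into raw bytes.
function dataUrlToBytes(dataUrl) {
  const comma = dataUrl.indexOf(',');
  const isDataUrl = dataUrl.startsWith('data:') && comma !== -1;
  const base64 = isDataUrl ? dataUrl.slice(comma + 1) : dataUrl;

  // atob exists in browsers and recent Node.js; fall back to Buffer otherwise.
  const binary = typeof atob === 'function'
    ? atob(base64)
    : Buffer.from(base64, 'base64').toString('binary');

  // Each character of the decoded string holds one byte value (0-255).
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// "/9j/" is how the JPEG payloads in this thread begin.
console.log(dataUrlToBytes('data:image/jpeg;base64,/9j/')); // bytes 0xFF 0xD8 0xFF
```

The bytes `FF D8 FF` are the start of a JPEG file, which is a quick way to check that the decoded data is the image itself and not its base64 text.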

GeorgeAnanthSoosai commented 8 years ago

Have you happened to see this error message?

runtime error: invalid memory address or nil pointer dereference

alex-doe commented 8 years ago

I know this won't help you directly, but I can give you an example of how to do it in C#: https://github.com/alex-doe/open-ocr-dotnet/blob/master/OpenOcrDotNet/Services/RestService.cs

The hard part of writing my C# lib was figuring out how the multipart request works. I think you'll have to figure this out on your own, like I did. Maybe someone on Stack Overflow can help you!

tleyden commented 8 years ago

@alex-doe cool!! I just added a link on https://github.com/tleyden/open-ocr/blob/master/README.md#client-libraries

GeorgeAnanthSoosai commented 8 years ago

@alex-doe @tleyden -- Can you please help me solve this issue using JavaScript? Thanks in advance!

GeorgeAnanthSoosai commented 8 years ago

I tried it the following way:

genMultipart(image, beforeContent, boundary) {
    // Decode the base64 string into raw bytes.
    var raw = atob(image);
    var rawLength = raw.length;
    var imageBinary = new Uint8Array(rawLength);
    for (var k = 0; k < rawLength; k++) {
      imageBinary[k] = raw.charCodeAt(k);
    }

    var before = [beforeContent, "\n\n"].join('');
    var after = '\n' + boundary;
    var size = before.length + imageBinary.byteLength + after.length;
    var uint8array = new Uint8Array(size);
    var i = 0;
    var j;

    // Append the leading string.
    for (; i < before.length; i++) {
      uint8array[i] = before.charCodeAt(i) & 0xff;
    }

    // Append the binary data.
    for (j = 0; j < imageBinary.byteLength; i++, j++) {
      uint8array[i] = imageBinary[j];
    }

    // Append the trailing string.
    for (j = 0; j < after.length; i++, j++) {
      uint8array[i] = after.charCodeAt(j) & 0xff;
    }
    return uint8array.buffer; // <-- This is an ArrayBuffer object!
  }

where before string is : "-----BOUNDARY \n \ \"Content-Type\": \"application/json\" \n \ \ {\"engine\":\"tesseract\"} \n \ -----BOUNDARY \n \ \ -----BOUNDARY \n \ \"Content-Type\": image/PNG \n \ \"Content-Disposition\": \"attachment; filename=attachment.txt\"."

It is called from the function below:


let headers = new Headers({ 'Content-Type': 'multipart/related; boundary=---BOUNDARY' });
let options = new RequestOptions({ headers: headers });

let newData = this.genMultipart(pngBinaryData, data, "-----BOUNDARY");
return this.http.post("/open-ocr/ocr-file-upload", newData, options)
  .toPromise()
  .then((result) => {
    console.log(result);
  }, (error) => {
    console.log(error);
  });

It's still not working.

It ends with the same error: runtime error: invalid memory address or nil pointer dereference

Any help on this?

The image I used is base64, and I converted it to binary before calling the API.
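For comparison with the attempts above, here is a minimal sketch of multipart framing as RFC 2046 describes it. Each delimiter line is `--` followed by the boundary value from the `Content-Type` header, so `boundary=---BOUNDARY` gives `-----BOUNDARY`. Lines end with CRLF, header names are not quoted, a blank line separates a part's headers from its content, and the final delimiter carries a trailing `--`. Whether the open-ocr server accepts exactly this layout is not confirmed in this thread. `buildMultipartRelated` is a hypothetical helper name, and the part headers simply mirror the ones quoted earlier.

```javascript
// Hypothetical helper: frames a JSON part and a binary image part as a
// multipart/related body, following RFC 2046 delimiter rules.
// boundary is the value of the Content-Type "boundary" parameter.
function buildMultipartRelated(boundary, jsonPart, imageBytes, imageType) {
  const CRLF = '\r\n';
  const encoder = new TextEncoder(); // encodes text as UTF-8 bytes

  const head = encoder.encode(
    '--' + boundary + CRLF +
    'Content-Type: application/json' + CRLF +
    CRLF +
    JSON.stringify(jsonPart) + CRLF +
    '--' + boundary + CRLF +
    'Content-Type: ' + imageType + CRLF +
    // The filename mirrors the sample quoted in this thread.
    'Content-Disposition: attachment; filename="attachment.txt"' + CRLF +
    CRLF
  );
  // Closing delimiter: "--" + boundary + "--".
  const tail = encoder.encode(CRLF + '--' + boundary + '--' + CRLF);

  // Concatenate text and binary sections without converting the image to text.
  const body = new Uint8Array(head.length + imageBytes.length + tail.length);
  body.set(head, 0);
  body.set(imageBytes, head.length);
  body.set(tail, head.length + imageBytes.length);
  return body;
}

// Example with a three-byte stand-in for real image data.
const sampleBody = buildMultipartRelated(
  '---BOUNDARY',
  { engine: 'tesseract' },
  new Uint8Array([0xff, 0xd8, 0xff]),
  'image/jpeg'
);
console.log(sampleBody.length);
```

The resulting `Uint8Array` could be passed to `request.send(...)` together with the header `Content-Type: multipart/related; boundary=---BOUNDARY`, so that the header's boundary value and the delimiter lines agree.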