ClosestStorm / v8cgi

Automatically exported from code.google.com/p/v8cgi
BSD 3-Clause "New" or "Revised" License

Enhancement: support for readline #97

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Reading long input a character at a time is inefficient and can be problematic 
when reading multibyte characters.  Could we please have a function to read a 
line from stdin?
Optional encoding and maximum string length parameters would be nice too.

Original issue reported on code.google.com by andy.bis...@gmail.com on 28 Sep 2011 at 11:12

GoogleCodeExporter commented 9 years ago
What API do you propose? An extension of existing system.stdin("\n") (or 
File::read) which reads bytes from standard input until the given string (a 
newline in this scenario) is encountered?

As for the encoding - the result of system.stdin/File::read is a (binary) 
Buffer, which can be converted to JS string using any encoding you want.
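
A minimal sketch of why working on bytes first and decoding afterwards sidesteps the multibyte concern. It does not use the v8cgi Buffer API; it assumes a runtime that provides the standard TextEncoder/TextDecoder globals (for example a recent Node.js), and splitLines is a hypothetical helper name. In UTF-8 the newline byte 0x0A never occurs inside a multibyte sequence, so splitting on it before decoding cannot cut a character in half.

```javascript
// Hypothetical helper: split raw UTF-8 bytes on newline, then decode each line.
// Bytes after the last newline are ignored in this sketch.
function splitLines(bytes) {
  var decoder = new TextDecoder("utf-8");
  var lines = [];
  var start = 0;
  for (var i = 0; i < bytes.length; i++) {
    if (bytes[i] === 0x0a) { // newline byte
      lines.push(decoder.decode(bytes.subarray(start, i)));
      start = i + 1;
    }
  }
  return lines;
}

// Stand-in for data read from stdin: two lines containing multibyte characters.
var inputBytes = new TextEncoder().encode("kůň\nžlutý\n");
var decodedLines = splitLines(inputBytes);
console.log(decodedLines.length);
```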

Original comment by ondrej.zara on 28 Sep 2011 at 7:32

GoogleCodeExporter commented 9 years ago
It would be useful to provide a readuntil(char c) function in the C++ layer.
This would avoid having to build strings character by character in
JavaScript, which is likely not efficient. 
I haven't yet tried to build v8cgi. I will have a go tomorrow and possibly
provide an implementation.

Original comment by andy.bis...@gmail.com on 28 Sep 2011 at 8:07

GoogleCodeExporter commented 9 years ago
You do not need to build strings character by character in JS... I am not sure 
what your use case is, but I am pretty certain that the current implementation 
does not force you to read input char-by-char.

This does not mean that having a readline facility is a bad idea, of course :)

Original comment by ondrej.zara on 28 Sep 2011 at 8:31

GoogleCodeExporter commented 9 years ago
Well I may have misunderstood the API but it seems that the only
options are to read to EOF or to read precisely n bytes.
The only way I can see to read until a specific byte is to read one at
a time and to test each for the desired terminator value.

I can't find any instructions to build on Windows and running scons
fails with an error that snprintf is not found.
Having never used mingw before this may take some time to resolve.

Original comment by andy.bis...@gmail.com on 29 Sep 2011 at 9:20

GoogleCodeExporter commented 9 years ago
Yes, you can read N bytes, buffer the data and scan for the terminator. I 
understand that this might not be the most comfortable/performant way, so I am 
adding readline functionality to the TODO list.
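
The "buffer the data and scan for the terminator" approach might look roughly like the sketch below. It is plain JavaScript with no v8cgi calls: createLineSplitter and feed are hypothetical names, and the chunks are assumed to be strings that have already been decoded from whatever the read call returned.

```javascript
// Hypothetical helper: accumulates chunks and hands back every complete line.
// Anything after the last terminator stays buffered until the next chunk.
function createLineSplitter(terminator) {
  var pending = "";
  return function feed(chunk) {
    pending += chunk;
    var lines = [];
    var index;
    while ((index = pending.indexOf(terminator)) !== -1) {
      lines.push(pending.slice(0, index));
      pending = pending.slice(index + terminator.length);
    }
    return lines;
  };
}

// Chunks arrive with arbitrary boundaries; lines come out whole.
var feed = createLineSplitter("\n");
var firstBatch = feed("hel");
var secondBatch = feed("lo\nwor");
var thirdBatch = feed("ld\n\n");
console.log(firstBatch.length, secondBatch.length, thirdBatch.length);
```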

I am not sure what causes the mingw-related snprintf bug (compiling on Windows 
works normally for me), but I can highly recommend compiling v8cgi on some 
Linux - it is much easier, more robust and more straightforward.

Original comment by ondrej.zara on 29 Sep 2011 at 10:30

GoogleCodeExporter commented 9 years ago
It's actually somewhat worse than that.  If you never receive N bytes
then the read never terminates.  The only safe value for N to be sure
to catch the terminator is 1, hence building the string byte by byte.
I am presently working around the difficulty by prefixing every write
to stdin with the number of bytes so that I know how many to request.
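
A sketch of such a length-prefix workaround, under stated assumptions: the header is a fixed-width, zero-padded decimal count (HEADER_WIDTH is an arbitrary choice, not something from this thread), payloads are ASCII so that characters and bytes coincide, and read(n) is a hypothetical function that returns exactly n characters. makeReader only simulates the stream for the demonstration.

```javascript
var HEADER_WIDTH = 8; // assumed width of the length header

// Writer side: prefix the payload with its zero-padded length.
function frame(payload) {
  var header = String(payload.length);
  while (header.length < HEADER_WIDTH) {
    header = "0" + header;
  }
  return header + payload;
}

// Reader side: read the fixed-size header first, then exactly that many characters.
function unframe(read) {
  var length = parseInt(read(HEADER_WIDTH), 10);
  return read(length);
}

// Simulated input stream standing in for stdin (hypothetical helper).
function makeReader(source) {
  var offset = 0;
  return function read(n) {
    var piece = source.slice(offset, offset + n);
    offset += n;
    return piece;
  };
}

var framedRead = makeReader(frame("ping") + frame("hello world"));
var firstMessage = unframe(framedRead);
var secondMessage = unframe(framedRead);
console.log(firstMessage, secondMessage);
```

Because every read asks for a known number of characters, the reader never blocks waiting for data that the writer did not send.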

Original comment by andy.bis...@gmail.com on 29 Sep 2011 at 12:05

GoogleCodeExporter commented 9 years ago
Ok I see how it works now.  In the description it says that stdin(0)
waits for EOF when in fact it reads the available characters.
This will allow for relatively efficient testing for delimiter characters.
Andy

Original comment by andy.bis...@gmail.com on 29 Sep 2011 at 2:40

GoogleCodeExporter commented 9 years ago
It seems I was right the first time, but I was misled by another error:
system.stdin(0) never returns (not even on an explicit EOF from the keyboard).

The following program just echoes keys but never reads anything.  It
certainly never prints an '*':

system.stderr('hello');
while(true) {
    bytes = system.stdin(0);
    system.stdout(bytes);
    system.stdout('*');
}

The intended behaviour (to return all available characters) would
resolve the difficulty that I presently have.

Original comment by andy.bis...@gmail.com on 30 Sep 2011 at 9:58

GoogleCodeExporter commented 9 years ago
This program actually works (prints out the asterisk); you just have to press 
enter AND ctrl-d after the text. 

I am not exactly sure if this corresponds with the desired/documented/logical 
expectations; I will try to analyze the topic in more detail and let you know 
afterwards. 

Original comment by ondrej.zara on 1 Oct 2011 at 3:21

GoogleCodeExporter commented 9 years ago
That's because it's using fread, which stops when the buffer is full or
when an error or EOF occurs.  EOF (ctrl-D) is not a satisfactory
delimiter because it closes the stream.
To read until a specific character it will be necessary to fread a
byte at a time into the buffer.  This is much more efficient than
slowly growing a javascript string.
My particular use case is to feed messages via stdin to be processed
on the v8cgi instance and replied on stdout.  Since the messages are
not of fixed length then I need to read until a delimiter.

Andy

Original comment by andy.bis...@gmail.com on 1 Oct 2011 at 7:54

GoogleCodeExporter commented 9 years ago
After some hacking to get this project to build under Visual Studio
(building using MinGW proved far too complicated) I can
now offer a basic prototype of the extension that I require for my application.

system.cc

/**
 * Read characters from stdin until terminator
 * @param {int} count Maximum number of characters to read (default 1024)
 * @param {int} char  Terminator character as integer (default newline)
 */
JS_METHOD(_readuntil)
{
    v8cgi_App * app = APP_PTR;

    size_t count = 1024;
    int until = (int)'\n';

    if (args.Length() > 0 && args[0]->IsNumber())
    {
        count = args[0]->IntegerValue();
    }

    /* test the second argument (not the first) before using it */
    if (args.Length() > 1 && args[1]->IsNumber())
    {
        until = args[1]->IntegerValue();
    }

    size_t size = 0;
    char * buf = new char[count];
    char ch;

    /* read one byte at a time; checking size first avoids writing
       past the end of buf when count is 0 */
    while (size < count) {
        if (app->reader(&ch, 1) != 1) break;  /* EOF or read error */
        buf[size++] = ch;
        if (ch == until) break;               /* terminator is kept in the result */
    }

    std::string data = std::string(buf, size);
    delete[] buf;
    return JS_BUFFER((char *) data.data(), size);
}

This could be tidied up to expect a single character string as the
second argument
and to provide some error messages but it's good enough to allow me to
continue with my application.

Andy

Original comment by andy.bis...@gmail.com on 12 Oct 2011 at 9:35

GoogleCodeExporter commented 9 years ago
Actually, both "stdin" and "stdout" will soon be refactored to represent a 
stream interface; therefore, you will do something like

system.stdin.read();
system.stdin.readLine();   // <-- this is what this issue is about :)

However, I cannot give you an ETA for this.

Original comment by ondrej.zara on 17 Oct 2011 at 6:24

GoogleCodeExporter commented 9 years ago
readLine implemented for system.stdin and File objects in r964.

Original comment by ondrej.zara on 14 Dec 2011 at 12:16

GoogleCodeExporter commented 9 years ago
I have had the opportunity to test this and it seems to work very well.
I particularly like the fact that the output functions return the
stream for method chaining.
I can now run the system that I am developing (multiple worker
processes including v8cgi processes managed by a node.js http server)
on r964 without modifications.

Original comment by andy.bis...@gmail.com on 14 Dec 2011 at 4:42

GoogleCodeExporter commented 9 years ago
Thanks for the confirmation!

Original comment by ondrej.zara on 15 Dec 2011 at 6:42