schveiguy / iopipe

D language library for modular io
Boost Software License 1.0

line counter example with scatter/gather read/write. #24

Closed: bioinfornatics closed this issue 4 years ago

bioinfornatics commented 4 years ago

Is it possible to provide a small example of cycling through the buffer to count occurrences of a character such as \n?

I tried the following, but BufferedInputSource is not a range:

test 1

import iopipe.textpipe;
import iopipe.bufpipe;
import std.io;
import std.typecons;
import std.algorithm: filter, reduce;
import std.range: enumerate;

alias Char = CodeUnit!(UTFType.UTF8);

// open a file and try to count the '\n' characters in it
void main(string[] args)
{
    File(args[1])               // open a file
         .refCounted            // File can't be copied
         .bufd!Char             // buffer it
         .push
         .filter!( u => cast(char)u == '\n')
         .enumerate(1)
         .reduce!("a + b");
}
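Error: template iopipe.bufpipe.bufd cannot deduce function from argument types !(char)(RefCounted!(File, cast(RefCountedAutoInitialize)0)), candidates are: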
/opt/ldc2/1.18.0/include/d/iopipe/iopipe/bufpipe.d(557):        bufd(T = ubyte, Allocator = GCNoPointerAllocator, ulong optimalReadSize = 8 * 1024 / T.sizeof, Source)(Source dev)
  with T = char,
       Allocator = GCNoPointerAllocator,
       optimalReadSize = 8192LU,
       Source = RefCounted!(File, cast(RefCountedAutoInitialize)0)
  must satisfy the following constraint:
       is(typeof(dev.read(T[].init)) == size_t)
/opt/ldc2/1.18.0/include/d/iopipe/iopipe/bufpipe.d(564):        bufd(T = ubyte, Allocator = GCNoPointerAllocator, ulong optimalReadSize = T.sizeof > 4 ? 8 : 32 / T.sizeof)()

test 2

import iopipe.textpipe;
import iopipe.bufpipe;
import std.io;
import std.typecons;
import std.algorithm: filter, reduce;
import std.range: enumerate;

alias Char = CodeUnit!(UTFType.UTF8);

// open a file, echo it to stdout, and try to count the '\n' characters in it
void main(string[] args)
{
    File(args[1])               // open a file
         .refCounted            // File can't be copied
         .bufd!Char.push!( a => a.encodeText!(UTFType.UTF8)
                                 .outputPipe(openDev(1)))
         .filter!( u => u == '\n')
         .map!( c => 1)
         .reduce!("a + b");
}
Error: template iopipe.bufpipe.bufd cannot deduce function from argument types !(char)(RefCounted!(File, cast(RefCountedAutoInitialize)0)), candidates are:
/opt/ldc2/1.18.0/include/d/iopipe/iopipe/bufpipe.d(557):        bufd(T = ubyte, Allocator = GCNoPointerAllocator, ulong optimalReadSize = 8 * 1024 / T.sizeof, Source)(Source dev)
  with T = char,
       Allocator = GCNoPointerAllocator,
       optimalReadSize = 8192LU,
       Source = RefCounted!(File, cast(RefCountedAutoInitialize)0)
  must satisfy the following constraint:
       is(typeof(dev.read(T[].init)) == size_t)
/opt/ldc2/1.18.0/include/d/iopipe/iopipe/bufpipe.d(564):        bufd(T = ubyte, Allocator = GCNoPointerAllocator, ulong optimalReadSize = T.sizeof > 4 ? 8 : 32 / T.sizeof)()

Thanks

schveiguy commented 4 years ago

So it's on purpose that an iopipe is not a range: the basic iopipe represents the entire stream, so it's not clear what the range elements should be.
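To make the stream model concrete, here is a minimal sketch of cycling through the buffer by hand with the three iopipe primitives used elsewhere in this thread (window, extend, release). The helper name countCodeUnit is hypothetical and not part of iopipe. The demo runs on a string, relying on iopipe.traits letting a plain array act as an iopipe.

```d
import iopipe.traits; // gives arrays window/extend/release via UFCS

// Hypothetical helper (not part of iopipe): counts how often `needle`
// occurs in the stream by walking it one window at a time.
size_t countCodeUnit(Chain)(Chain chain, char needle)
{
    size_t total = 0;

    // A freshly built pipe may start with an empty window; ask for data.
    if (chain.window.length == 0)
        cast(void) chain.extend(0);

    while (chain.window.length > 0)
    {
        // Scan everything that is currently buffered.
        foreach (unit; chain.window)
        {
            if (unit == needle)
                ++total;
        }

        // Drop the scanned data, then try to load more.
        // At end of stream the window stays empty and the loop ends.
        chain.release(chain.window.length);
        cast(void) chain.extend(0);
    }
    return total;
}

void main()
{
    assert(countCodeUnit("one\ntwo\nthree\n", '\n') == 3);
    assert(countCodeUnit("no newline here", '\n') == 0);
}
```

The same function should accept a buffered text pipe built from a file, since it only relies on window, extend and release.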

There is the asInputRange function, which turns an iopipe into a range of window buffers.
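As a rough sketch of how that could be used for the counting question: each element of the resulting range is one whole window, so the count is done per window and summed. The name countNewlines is hypothetical, and the demo again uses a string as the source, for which the range yields a single window.

```d
import iopipe.bufpipe : asInputRange;
import std.algorithm : count;

// Hypothetical helper (not part of iopipe): sums the '\n' count of
// every window that asInputRange yields.
size_t countNewlines(Chain)(Chain chain)
{
    size_t total = 0;
    foreach (window; chain.asInputRange)
        total += window.count('\n');
    return total;
}

void main()
{
    assert(countNewlines("one\ntwo\nthree\n") == 3);
    assert(countNewlines("single line") == 0);
}
```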

I really should add something to convert to an input range of element types. The thing is I don't want to confuse algorithms by projecting a certain range mechanism.

I wrote this for someone at DConf to show how it would work:

import iopipe.bufpipe;
import iopipe.textpipe;
import iopipe.traits;
import std.range;
import std.utf;

struct ByElementRange(Chain)
{
    Chain chain;
    this(Chain chain)
    {
        this.chain = chain;
        if(chain.window.length == 0)
            cast(void)chain.extend(0);
    }
    auto front() { return chain.window[0]; }
    void popFront()
    {
        chain.release(1);
        if(chain.window.length == 0)
            cast(void)chain.extend(0);
    }
    bool empty() { return chain.window.length == 0; }
}

auto byElementRange(Chain)(Chain chain)
{
    return ByElementRange!Chain(chain);
}

void main()
{
    import std.algorithm;
    auto input = "some sentence I want to separate by words";
    auto wordrange = input.delimitedText(' ').asInputRange;
    assert(wordrange.equal(["some ", "sentence ", "I ", "want ", "to ", "separate ", "by ", "words"]));
    auto charRange = input.byElementRange;
    assert(charRange.take(6).equal(input[0 .. 6].byCodeUnit));
    assert(charRange.front == 's');
}

schveiguy commented 4 years ago

Oh, and you have an error in your iopipe chain. A file is typed as a stream of ubyte, which is why the dev.read(T[].init) constraint in your error message fails for T = char. You need to buffer in that form, and then wrap as an encoded stream.

e.g.:

File(args[1]).refCounted.bufd.assumeText!Char...

And also, if you wanted to count lines, you could use iopipe.textpipe.byLineRange
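A minimal sketch of the line-counting route, assuming the lines themselves are not needed and only their number matters. The name countLines is hypothetical. The demo feeds a string, but any text iopipe, such as the file chain described above, should be usable in its place.

```d
import iopipe.textpipe : byLineRange;
import std.range : walkLength;

// Hypothetical helper (not part of iopipe): byLineRange yields one
// element per line, so walking the range gives the line count.
size_t countLines(Chain)(Chain chain)
{
    return chain.byLineRange.walkLength;
}

void main()
{
    assert(countLines("one\ntwo\nthree") == 3);
}
```

Note that this counts lines rather than '\n' characters, so a final line without a trailing newline is still counted.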

schveiguy commented 4 years ago

@bioinfornatics #25

bioinfornatics commented 4 years ago

thanks @schveiguy