Open denilsonsa opened 9 years ago
You can do this via command-line args as well; it might be simpler. For example, put this into join.nip2:
#!/home/john/vips/bin/nip2 -s
main
= foldl1 join_lr images
{
images = map Image_file (tl argv);
}
Then:
$ ./join.nip2 -o x.tif join/1*.tif
$ vipsheader x.tif
x.tif: 24000x2500 uchar, 3 bands, srgb, tiffload
You could also split your function into chunks. I think (from memory) that the limit applies to a single definition; there's no limit for a file. You could write:
foo1 = [a, b, c, d, e, ...];
foo2 = [aa, ab, ac, ad, ...];
foo = foo1 ++ foo2;
As long as each single fooN is under the limit, you'll be OK. I think!
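If the nip2 script is machine-generated anyway, the chunking can be automated. Here is a minimal sketch in Python; the function name chunked_definitions and the chunk size are hypothetical, and the right chunk size depends on the per-definition limit, which I have not measured.

```python
def chunked_definitions(name, items, chunk_size):
    """Return nip2 source lines defining `name` as a concatenation of
    smaller lists, each holding at most `chunk_size` items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    lines = []
    parts = []
    for start in range(0, len(items), chunk_size):
        part = f"{name}{len(parts) + 1}"  # foo1, foo2, ...
        body = ", ".join(items[start:start + chunk_size])
        lines.append(f"{part} = [{body}];")
        parts.append(part)
    # An empty input still needs a valid final definition.
    lines.append(f"{name} = {' ++ '.join(parts) if parts else '[]'};")
    return lines
```

Printing "\n".join(chunked_definitions("foo", items, 500)) would give one short definition per chunk plus a final line that concatenates them with ++.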
One more thing: when nip2 opens a file, it has to make sure it has random access to the pixels, since it doesn't know how you might end up using the image. If the image is not in a format which supports random access, nip2 will convert it for you behind your back. It makes a set of vips (.v) files in its temporary area.
This means that if you are using it to join a large number of HUGE files (as I think you are, is that right?) it's going to copy each file as it starts. This could be slow, and will certainly use a lot of disc space.
If you use Python, you can give an access hint on open (that stackoverflow example does this) saying that you only need a simple top-to-bottom scan of the pixels. In this case, libvips can do the join with no intermediate images, for quite a good speedup.
I would write it in Python, but installing the required dependencies will require some work (I wish it were a simple straight-forward apt-get).
Also, this has been a great opportunity to write stuff in a functional language, as I haven't done that since… well… forever. :)
Is there any way to set the file access to SEQUENTIAL_UNBUFFERED in nip2?
Right now I've managed to:
read), foldr and a function that calls im_insert to a list of all images and coordinates).
Unfortunately, when the number of files is around 5475 (it works for 1406 files), the script aborts:
error calculating "root.main"
No value.
Symbol "large_rgb_r" has no value.
error in "main" (/dev/stdin:209): Symbol "large_rgb_r" has no value.
error in "large_rgb_r" (/dev/stdin:210): C stack overflow. Expression too complex.
For now, I'm testing this on a bunch of files that were written by dzsave. Later, I plan to use this on a bunch of screenshots. If I ever hit the limit again, I'll figure out how to work around it, either by manually compiling nip2 and increasing SPINE_SIZE, or by using intermediate files.
There's an updated libvips in Debian experimental. Hopefully this'll get into Ubuntu soon. vips8 was released in May, so it's early days still.
Yes, the functional stuff is fun, isn't it? nip2 is a lazy functional language and libvips is a lazy image processing library, so they work rather well together, I think. You can apply a morphological operator to an image an infinite number of times and take the fixed point. Nice!
I guess you found parse_int and parse_float in nip2? I'd use insert rather than im_insert.
No, sorry, nip2 is a vips7 program and the SEQUENTIAL stuff is all vips8. It's one of the reasons nip2 needs updating. There's a nip3 branch in git but it's not finished.
You're probably hitting a couple of limits. Most Unixes can't have more than 1024 files per process open at once, sadly. Most also have a 2MB stack per thread, so as nip2 recurses down the spine of the graph, there's a limit to how complex you can get. You'd need to rework reduce_spine() to shrink the stack frame size, or rework your expressions to shrink the spine. Or rework reduce_spine() a lot to make it not recursive, perhaps by using pointer reversing.
You'll probably find you need to build your output image in sections and assemble the sections at the end.
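To illustrate why assembling in sections helps, here is a small Python sketch that compares the nesting depth of a left-to-right chain of joins with a pairwise (balanced) arrangement of the same joins. The names join_left, depth and fold_balanced are made up for the illustration, and the tuples only model the shape of the expression; this says nothing definite about how nip2's reducer would treat it.

```python
from functools import reduce


def join_left(a, b):
    """Stand-in for join_lr: record the join as a nested tuple."""
    return ("join", a, b)


def depth(expr):
    """Nesting depth of an expression, computed without recursion."""
    deepest = 0
    stack = [(expr, 0)]
    while stack:
        node, d = stack.pop()
        deepest = max(deepest, d)
        if isinstance(node, tuple):
            stack.append((node[1], d + 1))
            stack.append((node[2], d + 1))
    return deepest


def fold_balanced(join, items):
    """Join neighbours pairwise until a single expression remains."""
    items = list(items)
    if not items:
        raise ValueError("need at least one item")
    while len(items) > 1:
        paired = [join(items[i], items[i + 1])
                  for i in range(0, len(items) - 1, 2)]
        if len(items) % 2:
            paired.append(items[-1])  # odd one out moves up unchanged
        items = paired
    return items[0]


names = [f"tile{i}" for i in range(5475)]
chain = reduce(join_left, names)        # shaped like foldl1 join_lr
tree = fold_balanced(join_left, names)  # same left-to-right order
print(depth(chain), depth(tree))
```

For 5475 inputs the chain is thousands of levels deep while the pairwise version stays around a dozen levels, and the tiles keep their left-to-right order in both.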
I guess you found parse_int and parse_float in nip2?
Yep!
I'd use insert rather than im_insert.
Why?
Most Unixes can't have more than 1024 files per process open at once, sadly.
ulimit -a -S and ulimit -a -H report a 32K open-files limit, and indeed an 8192 soft limit for the stack (but no hard limit). However, nip2 is hitting its own internal limit, not the external one (I tried increasing that limit, no luck).
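For anyone wanting to check the same limits from a script rather than the shell, here is a minimal Python sketch using the standard resource module (Unix only). The helper names describe and process_limits are made up for the example. Note that getrlimit reports the stack size in bytes, whereas the shell's ulimit -s shows it in 1024-byte units.

```python
import resource


def describe(limit):
    """Render an rlimit value, which may be RLIM_INFINITY."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


def process_limits():
    """Return (soft, hard) limits for open files and stack size in bytes."""
    report = {}
    for label, which in (("open files", resource.RLIMIT_NOFILE),
                         ("stack bytes", resource.RLIMIT_STACK)):
        soft, hard = resource.getrlimit(which)
        report[label] = (describe(soft), describe(hard))
    return report


for label, (soft, hard) in process_limits().items():
    print(f"{label}: soft={soft} hard={hard}")
```

These are the limits the operating system applies to the process; they say nothing about limits hardcoded inside nip2 itself.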
Does nip2 employ tail call optimization? By your comment, I suppose it doesn't.
im_insert calls the base vips7 function directly. insert is a wrapper for im_insert defined in _stdenv.def which is supposed to make it nicer to use. It puts the arguments in a more functional order, and works for more types of object, not just plain image pointers.
Ah, OK, sounds like it's not the file limit then, just the C stack. The 2MB limit is actually hardcoded:
https://github.com/jcupitt/nip2/blob/master/src/reduce.c#L1091
since I don't know of a way to get the C stack size. You could try raising that, though of course you could well start seeing mysterious crashes if memory is corrupted.
nip2's evaluation is a graph reduction machine using a variant on Turner's combinators (I use S, Sl and Sr instead of SKI), so (I think) tail recursion elimination isn't possible. The fixes might be keeping our own stack, or using pointer reversal to put the stack into the graph.
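As a rough illustration of the "keeping our own stack" option (not pointer reversal, and not nip2's actual reduce_spine()), here is a Python sketch that evaluates a deeply nested expression with an explicit work list instead of recursion. The tuple encoding and the name evaluate are assumptions made up for the example.

```python
def evaluate(expr):
    """Evaluate nested ("add", left, right) tuples without recursion.

    Pending work lives in a Python list rather than on the C stack.
    """
    work = [("visit", expr)]
    values = []
    while work:
        action, node = work.pop()
        if action == "visit":
            if isinstance(node, tuple):
                _, left, right = node
                # Popped in reverse order: left, then right, then combine.
                work.append(("combine", None))
                work.append(("visit", right))
                work.append(("visit", left))
            else:
                values.append(node)
        else:
            # combine: both operand values are on the value stack
            right = values.pop()
            left = values.pop()
            values.append(left + right)
    return values.pop()


# Build a chain 50,000 levels deep, far beyond Python's default
# recursion limit, so a naive recursive evaluator would fail on it.
expr = 0
for i in range(1, 50001):
    expr = ("add", expr, i)
print(evaluate(expr))
```

The depth the evaluator can handle is then bounded by heap memory rather than by the thread's stack size.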
Hi again, 8.2 is in beta now and has a new operator which could help you:
It'll join an array of images up in a rectangular grid. I've been able to join 10,000 large TIFF files in a single operation, it should go higher than that. There are some improvements to TIFF memory use, and to C stack use too.
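For reference, a grid join of this kind might look like the following from Python. This is only a sketch under assumptions: it uses the pyvips binding and its arrayjoin operation, which postdate this thread and may not be what was linked above, and the helper names grid_size and join_grid are hypothetical. The access="sequential" hint is the top-to-bottom scan hint mentioned earlier in the thread.

```python
import math


def grid_size(count, across):
    """Columns and rows needed to lay out `count` tiles, `across` per row."""
    if count < 1 or across < 1:
        raise ValueError("count and across must be positive")
    return min(count, across), math.ceil(count / across)


def join_grid(paths, across, output):
    """Join image files into a grid and write the result to `output`."""
    # Assumption: pyvips is installed. It is imported here so that
    # grid_size() stays usable on machines without it.
    import pyvips

    # Sequential access promises a top-to-bottom read of each input.
    tiles = [pyvips.Image.new_from_file(p, access="sequential")
             for p in paths]
    joined = pyvips.Image.arrayjoin(tiles, across=across)
    joined.write_to_file(output)
    return grid_size(len(tiles), across)
```

A call such as join_grid(sorted(glob.glob("join/1*.tif")), 10, "x.tif") would lay the tiles out ten per row, in the order given.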
I have a bash script that generates a nip2 script with the following content:
It goes on and on for several thousand lines. Unfortunately, for a large enough input, the script hits a parser limitation in nip2:
This error is from parser.y, and MAX_STRSIZE is defined in lex.l.
Sure, I'm going to work around this limitation in my script (by reading and parsing from an external file), but it would be great if the limit were lifted. I also understand that lifting it may require some non-trivial changes, so I'd probably label this as an enhancement.