metaeducation / rebol-issues

FIND fails on huge strings #1227

Open rebolbot opened 15 years ago

rebolbot commented 15 years ago

Submitted by: Sunanda

FIND fails with a memory error when searching large strings. (R2 does not have this issue).

In effect, useful strings (ones you can search, etc.) seem to be limited to half the maximum length possible in R2.

msl: to-integer (2 ** 30)  ;; max string length for test
;; ungainly way of constructing a lengthy string:
bs: copy ""                ;; big string
while [(length? bs) < msl] [
    print length? bs
    append bs mold system
]
0
441344
1324746
3093692
6638010
13745924
28019586
56740412
114702570
232188404
** Internal error: not enough memory
** Where: mold while
** Near: mold system

Now try searching our big string:

find bs "find"
== ** Internal error: not enough memory
recycle
== 7
recycle
== 1
recycle
== 0
find bs "find"
== ** Internal error: not enough memory
find/part bs "find" 1000
== ** Internal error: not enough memory

CC - Data [ Version: alpha 82 Type: Bug Platform: Windows Category: Unspecified Reproduce: Always Fixed-in: none ]

rebolbot commented 15 years ago

Submitted by: BrianH

I'm not sure this is a bug. You say that this happens in R2 as well, but for strings that are twice as long? That would be because strings in R3 are stored at 16 bits per character by default, whereas they were 8 bits per character in R2. It's a Unicode thing.
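
To put rough numbers on the 16-bit versus 8-bit point, here is a minimal sketch of the character storage the string in the report would need. It assumes exactly 1 byte per character in R2 and 2 bytes per character in R3, and it ignores any series overhead. The names `chars`, `r2-bytes` and `r3-bytes` are illustrative only.

```rebol
chars: 232188404          ;; last length printed before MOLD failed
r2-bytes: chars * 1       ;; assumption: 1 byte per character (R2)
r3-bytes: chars * 2       ;; assumption: 2 bytes per character (R3)
print ["R2 bytes:" r2-bytes]
print ["R3 bytes:" r3-bytes]
```

Under these assumptions the same string needs twice the storage in R3, which would be consistent with the usable length being about half of what R2 allows.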

Your example puts the system into an out-of-memory state before you even start using FIND, then expects FIND to work. When you are out of memory, a lot of things don't work. Do you expect FIND to work when you are already out of memory?

rebolbot commented 15 years ago

Submitted by: Carl

BrianH's points are valid.

Sunanda, can you print the length of the string prior to the failure?

It is also possible that there's a signed integer issue going on here.
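
One way to see where a signed integer issue could come in is to check the byte count of the target length from the report against the largest signed 32-bit value. This is only a sketch: it assumes 2 bytes per character and an R3 build whose integer! is 64-bit (in R2 the multiplication itself would overflow). The names `bytes-needed` and `max-int32` are illustrative.

```rebol
msl: to-integer 2 ** 30        ;; target length from the report, in characters
bytes-needed: msl * 2          ;; assumption: 2 bytes per character
max-int32: 2147483647          ;; largest signed 32-bit value, (2 ** 31) - 1
print bytes-needed > max-int32 ;; true when the size no longer fits in 32 signed bits
```

Note that the failure in the report occurred at a much smaller length (232188404 characters), so this shows only that the target size crosses the 32-bit boundary, not that an overflow caused the error.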