Open ArsenArsen opened 2 years ago
I'm unable to reproduce this, unfortunately:
❯ while TRUE; cargo run -- build --input local-dev/test-configs/federalist.toml --output - 2> /dev/null | wc; end
1512 46242 1125456
1512 46242 1125456
1512 46242 1125456
1512 46242 1125456
However, I suspect that merging https://github.com/jameslittle230/stork/pull/272 will fix this issue. If you're able, could you pull that branch, build Stork locally, and retry your test?
[i] ~/stork 130 $ while :; do stork build --input - --output - 2>/dev/null <local-dev/test-configs/federalist.toml | wc --bytes; done | uniq
1124840
1125456
1125420
1125456
1124926
1125456
1124919
1125183
1125456
1125220
1125456
The above is unpatched. For some reason, the Federalist Papers example reproduces this issue a lot less (I had to use uniq to reduce the non-wrong result spam).
Patched:
[c] ~/stork$ while :; do ./target/debug/stork build --input - --output - 2>/dev/null <local-dev/test-configs/federalist.toml | wc --bytes; done
1125456
1125456
1125451
1125456
1125456
1124734
1125456
It'd appear flushing does not help (IIRC, I tried this myself after opening the issue anyways).
For some reason, though, STORK-262/fix-write-to-stdout is insanely slow.
Please try this on a glibc system (such as the Debian Docker container) too.
It's probably worth noting that I'm using rustc 1.58.1 and cargo 1.58.0
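Since the wrong sizes only show up intermittently, a bounded tally may be easier to compare across machines than an endless loop. The following is a rough sketch, not part of Stork: `tally_sizes` is a hypothetical helper name, and the run count is arbitrary.

```shell
# Hypothetical helper: run a command N times and tally the distinct
# output sizes (in bytes) it writes to stdout.
# Usage: tally_sizes RUNS COMMAND [ARGS...]
tally_sizes() {
    runs=$1
    shift
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard stderr, count stdout bytes, strip the padding some wc builds add.
        "$@" 2>/dev/null | wc -c | tr -d ' '
        i=$((i + 1))
    done | sort -n | uniq -c
}
```

For example, `tally_sizes 50 stork build --input local-dev/test-configs/federalist.toml --output -` prints one line per distinct size, prefixed by how many runs produced it. A single line would mean every run wrote the same number of bytes.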
PS: Is there some realtime communication channel? It'd likely be more ergonomic to test these kinds of weird issues that way
Weird - thanks for checking. I'll be sure not to merge #272 if it makes things too slow.
When you were reproducing it with your own config, was it producing index files that were bigger or smaller than the Federalist Papers example?
I'll keep working on a repro and check back in.
There's no chat set up for the project - I haven't had a need to spin something like that up yet, and I don't yet have a good sense for how useful it would be over Github issues and discussions. Happy to consider it, though - any suggestions?
> Weird - thanks for checking. I'll be sure not to merge https://github.com/jameslittle230/stork/pull/272 if it makes things too slow.
Flush alone shouldn't; I think this was just system load. I can't reproduce it now. Even when reverting the BTreeMap changes I only get a 13% increase in speed (builds per second).
> When you were reproducing it with your own config, was it producing index files that were bigger or smaller than the Federalist Papers example?
I was under the impression I included my results - my bad! Considerably smaller.
381545
378037
381545
381523
> I'll keep working on a repro and check back in.
This just gets weirder: I am now unable to reproduce it with the flush. This issue would seem to be fixed now? Well, at any rate, stdout not being flushed on exit seems like a Rust runtime bug too.
> There's no chat set up for the project - I haven't had a need to spin something like that up yet, and I don't yet have a good sense for how useful it would be over Github issues and discussions. Happy to consider it, though - any suggestions?
I don't really have any special suggestions here, just the usual (Matrix or Libera.Chat; Zulip is also a thing some swear by but I haven't used it much). Whatever works for you works for me
Hello, I'm experiencing the same issue as we're trying to integrate Stork into Emanote (https://github.com/EmaApps/emanote/pull/327).
When using --output - the index becomes "corrupted"/unusable.
To repro:
echo -e "[input]\nfiles = [" > stork.toml
while read -r file
do
echo " {path=\"$file\", url=\"$file\", title=\"$(basename "$file")\"}," >> stork.toml
done < <(find content -name '*.md')
echo "]" >> stork.toml
stork build -i stork.toml -o index-from-flag.st
stork build -i stork.toml -o /dev/stdout > index-from-stdout.st
stork build -i stork.toml -o - > index-from-dash.st
Attempt a search:
$ stork search -q foo -i index-from-flag.st
(large json output)
$ stork search -q foo -i index-from-stdout.st
(large json output)
$ stork search -q foo -i index-from-dash.st
thread 'main' panicked at 'split_to out of bounds: 679254 <= 679213', /home/kalle/.cargo/registry/src/github.com-1ecc6299db9ec823/bytes-1.1.0/src/bytes.rs:402:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
(Using Stork v1.5.0 btw)
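To check the three files without running a search, they can be compared byte for byte. This is only a sketch: `compare_to_reference` is a hypothetical helper, and it assumes repeated builds of the same config would otherwise be identical, which this thread does not confirm. A size mismatch is the stronger signal; a same-size difference might just reflect nondeterministic ordering in the index.

```shell
# Hypothetical helper: report whether each candidate file is
# byte-identical to a reference file, and show both sizes if not.
# Usage: compare_to_reference REFERENCE CANDIDATE...
compare_to_reference() {
    ref=$1
    shift
    for f in "$@"; do
        if cmp -s "$ref" "$f"; then
            echo "$f: identical"
        else
            echo "$f: differs ($(wc -c < "$ref" | tr -d ' ') vs $(wc -c < "$f" | tr -d ' ') bytes)"
        fi
    done
}
```

With the files from the repro, that would be `compare_to_reference index-from-flag.st index-from-stdout.st index-from-dash.st`.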
@jilleJr - thanks for the repro steps. I'll take a look later today and report back what I find.
Related to #261. This makes it impossible to use stork as a filter. Notably, /dev/stdout does not have the same issue, implying this is an issue with how Rust opens stdout.
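Until that is sorted out, one possible stopgap for filter-style use is to build into a temporary file and stream it afterwards, since writing to a named file worked in the repro above. A minimal sketch, assuming only the `-i` and `-o` flags already shown in this thread; `build_to_stdout` and the `STORK_BIN` variable are hypothetical names, not Stork features.

```shell
# Hypothetical wrapper: build the index into a temporary file, then
# copy that file to stdout so the result can be piped onward.
# STORK_BIN is an assumed override for the binary name (defaults to stork).
build_to_stdout() {
    config=$1
    tmp=$(mktemp) || return 1
    if "${STORK_BIN:-stork}" build -i "$config" -o "$tmp"; then
        cat "$tmp"
        status=$?
    else
        status=1
    fi
    # Clean up whether or not the build succeeded.
    rm -f "$tmp"
    return "$status"
}
```

Usage would look like `build_to_stdout stork.toml | wc -c`. Where /dev/stdout exists, passing it to `-o` directly is simpler; the temporary file only helps on systems without it.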