tools4j / unix4j

An implementation of Unix command line tools in Java.
unix4j.org
MIT License

grep performance #80

Open jinceon opened 1 year ago

jinceon commented 1 year ago

I have a large file (1.5 GB). Running grep on the command line takes 2 seconds, but Unix4j's grep takes 31 seconds.

terzerm commented 1 year ago

Can you provide some more details? Some grep statements are faster than others. Can you please share a code snippet?

jinceon commented 1 year ago

Unix4j.grep(string, file).toStringList()


terzerm commented 1 year ago

You can try the fixed-string option; this should usually be faster:

Unix4j.grep(Grep.Options.fixedStrings, string, file).toStringList()
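For context, "fixed strings" means the pattern is treated as literal text rather than as a regular expression, so each line needs only a substring check instead of a regex match. The following is a minimal plain-Java sketch of that difference, not Unix4j's actual implementation; the class and method names (`LiteralVsRegexGrep`, `grepFixed`, `grepRegex`) are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of Unix4j.
public class LiteralVsRegexGrep {

    // Fixed-string matching: keep lines that contain the needle as literal text.
    public static List<String> grepFixed(String needle, Path file) throws IOException {
        List<String> matches = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.contains(needle)) {
                    matches.add(line);
                }
            }
        }
        return matches;
    }

    // Regex matching: keep lines in which the compiled pattern is found anywhere.
    public static List<String> grepRegex(String regex, Path file) throws IOException {
        Pattern pattern = Pattern.compile(regex);
        List<String> matches = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (pattern.matcher(line).find()) {
                    matches.add(line);
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        // Small temporary file so the sketch is self-contained.
        Path file = Files.createTempFile("grep-demo", ".txt");
        Files.write(file, Arrays.asList("a.b", "axb", "plain"), StandardCharsets.UTF_8);

        System.out.println(grepFixed("a.b", file)); // prints [a.b]
        System.out.println(grepRegex("a.b", file)); // prints [a.b, axb]

        Files.delete(file);
    }
}
```

The two calls return different results for the same pattern because `.` is a wildcard in the regex version and an ordinary character in the fixed-string version. Both methods collect every matching line into a list, as `toStringList()` does, so memory use grows with the number of matches.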

jinceon commented 1 year ago

It works; it now takes 14 seconds.

But it is still slower than native grep.


jinceon commented 1 year ago

[image: snapshot]