crystal-lang / crystal

The Crystal Programming Language
https://crystal-lang.org
Apache License 2.0
19.39k stars 1.62k forks source link

File#write bad performance #1899

Closed sdogruyol closed 8 years ago

sdogruyol commented 8 years ago

I'm trying to find out why File#write is more than 5 times slower from IO::FileDescriptor#write.

Seems like File#write opens the file every time thus leading to major performance difference https://github.com/manastech/crystal/blob/master/src/file.cr#L418-L422

You can check this specific commit for more info https://github.com/kemalcr/kemal/commit/f66e9cd834c7caced50b7e16759458b22e1ffbe9#diff-01d78faca9cb05340b05f208a58ac8e9R38

Any ideas?

sdogruyol commented 8 years ago

Guess IO::FileDescriptor#write directly calls LibC#write unbuffered.

https://github.com/manastech/crystal/blob/0f4e5866db1d559fdb3c833e8de8a1f96ab7c957/src/io/file_descriptor.cr#L176

waj commented 8 years ago

In your case, it makes sense to keep the log file open instead of opening and closing the file every time.

From your description it seems that you expect File#write to keep the file open, but is that the desired behaviour? I don't think so. Also, it overwrites the entire file, which I think is not what you expect from a log file.

sdogruyol commented 8 years ago

@waj by default it's not the desired behaviour like you mentioned. But for performant logging i need a way to keep it open and avoid reopening every time.

By the way what's overwriting the file?

waj commented 8 years ago

File.write opens and close the file every time. And it also overwrites the entire file with the new content. That is not a bug, it's by design (and it's also how it works in Ruby as well).

You should keep an open file (with File.open) in order to have better write performance and keep appending to the file.

sdogruyol commented 8 years ago

Is there any way of appending to a file other than File#write ?

jhass commented 8 years ago

We need to get something straight here: File#write denotes the instance method write on the class File, the one you call by File.new(...).write(...). File.write denotes the class method write on the class File, the one you call by File.write(...). In fact File#write is actually IO#write, just like IO::FileDescriptor#write is.

Of course File.write opens, rewrites and closes the file, that's the entire reason for its existence, as a convenience wrapper around File.open(..., &.write(...)). If you want to keep the file open, use File.new and pass the instance around.

sdogruyol commented 8 years ago

@jhass That's exactly what i was doing here https://github.com/kemalcr/kemal/blob/0e23e8d9a33ae1cc25d2328440a737eb467960b4/src/kemal/logger.cr#L38

But the problem is that in this way the logs are not working as intended. Sometimes it's lost or ordered wrongly because of being unbuffered.

jhass commented 8 years ago

If you say so, I don't have the motivation to trace through your entire architecture atm.

Then looking at File.write is irrelevant and your assertion that File#write reopens the file on each write is wrong.

sdogruyol commented 8 years ago

I still think that https://github.com/manastech/crystal/issues/1781 may be the reason for lost and unordered logs

sdogruyol commented 8 years ago

@jhass actually my original assertion of File.write is correct. The notation i used File#write was wrong.

ozra commented 8 years ago

File.write should work as it does now afaic. @sdogruyol when you have lost output: do you mean upon app shutdown / crash, or just anywhere in the course of running?

sdogruyol commented 8 years ago

Both in the course of running and shutdown.

omninonsense commented 8 years ago

Have you tried flushing explicitly? From glancing at your code, I don't think there's any flushing.

class Kemal::Logger < HTTP::Handler

# ...
  def fush
    @handler.flush
  end
# ...
end

Or maybe try @handler.flush_on_newline = true in initialize, maybe?

It seems Crystal doesn't flush all buffers on exit; maybe I'm wrong, but to me it seems like only STDOUT and STDERR are flushed from looking at Kernel#exit.

Also, I haven't looked too closely at your code—correct me if I am wrong—but if you're using Fibers (ie spawn) or Threads, it shouldn't be surprising that your stuff is out of order, you will probably have to add some sort of lock, since I don't think Crystal will do that for you.

jhass commented 8 years ago

It's an instance of File, so File.write is completely irrelevant!

waterlink commented 8 years ago

What I see from the code of kemal:

You need to keep in mind, that both techniques reduce performance by a bit. I suggest benchmarking to understand if it is relevant slowdown or not.

Best Regards, Oleksii Fedorov, Sr Ruby, Clojure, Crystal, Golang Developer, Microservices Backend Engineer, +49 15757 486 476

On Sat, Nov 21, 2015 at 3:12 PM, Jonne Haß notifications@github.com wrote:

It's an instance of File https://github.com/sdogruyol/kemal/blob/master/src/kemal/logger.cr#L7, so File.write https://github.com/manastech/crystal/blob/master/src/file.cr#L418-L422 is completely irrelevant!

— Reply to this email directly or view it on GitHub https://github.com/manastech/crystal/issues/1899#issuecomment-158645409.

sdogruyol commented 8 years ago

Using flush sounds reasonable.

sdogruyol commented 8 years ago

Confirmed that using flush_on_newline solves the issue. It has around %10 - 15 performance penalty but much better than File#write

Thanks a lot :+1:

sdogruyol commented 8 years ago

@asterite is there a specific reason why flush_on_newline isn't true by default?

asterite commented 8 years ago

@sdogruyol It's slower than not doing it. The issue is really not using a lock in file writes, basically this is a dup of #1781