ocaml / ocaml

The core OCaml system: compilers, runtime system, base libraries
https://ocaml.org
Other
5.42k stars 1.1k forks source link

provide Unix.write_for_nonblock and Unix.read_for_nonblock #11992

Open craff opened 1 year ago

craff commented 1 year ago

With domains or async, it is more and more common to read and write on nonblocking sockets and current Unix.write/Unix.single_write/Unix.read are tailored for blocking sockets with

I tested on a http server implementation an optimized read/write and got between an overall 30% speed increase! this is a lot.

I propose to add code similar to this to unix.ml and its stubs (note: I separated the test for exception to allow using [@@noalloc], I am not sure this is worth it ?

external raw_nb_write : Unix.file_descr -> Bytes.t -> int -> int -> int = "caml_nb_write" [@@noalloc]

external write_error : unit -> 'a = "caml_write_error"

let nb_write fd buf ofs len =
  if ofs < 0 || len < 0 || ofs+len > Bytes.length buf then
    invalid_arg "single_write";
  let ret = raw_nb_write fd buf ofs len in
  if ret == -1 then write_error();
  ret

external raw_nb_read : Unix.file_descr -> Bytes.t -> int -> int -> int = "caml_nb_read" [@@noalloc]

external read_error : unit -> 'a = "caml_read_error"

let nb_read fd buf ofs len =
  if ofs < 0 || len < 0 || ofs+len > Bytes.length buf then
    invalid_arg "read";
  let ret = raw_nb_read fd buf ofs len in
  if ret == -1 then read_error();
  ret

and

CAMLprim value caml_nb_write(value fd, value buf, value vofs,
                                      value vlen)
{
  CAMLparam1(buf);
  long ofs, len;
  int ret;

  ofs = Long_val(vofs);
  len = Long_val(vlen);

  ret = write(Int_val(fd), &Byte(buf, ofs), len);
  CAMLreturn(Val_int(ret));
}

CAMLprim void caml_write_error() {
  CAMLparam0();
  caml_uerror("single_write", Nothing);
  CAMLreturn0;
}

CAMLprim value caml_nb_read(value fd, value buf, value vofs,
                                      value vlen)
{
  CAMLparam1(buf);
  long ofs, len;
  int ret;

  ofs = Long_val(vofs);
  len = Long_val(vlen);

  ret = read(Int_val(fd), &Byte(buf, ofs), len);
  CAMLreturn(Val_int(ret));
}

CAMLprim void caml_read_error() {
  CAMLparam0();
  caml_uerror("read", Nothing);
  CAMLreturn0;
}
nojb commented 1 year ago

Thinking out loud: would it make sense to query the OS to know whether the file descriptor is in non-blocking mode in the current Unix.write/Unix.read and avoid the runtime lock/buffer copy dance in that case?

xavierleroy commented 1 year ago

Thinking out loud: would it make sense to query the OS to know whether the file descriptor is in non-blocking mode in the current Unix.write/Unix.read and avoid the runtime lock/buffer copy dance in that case?

I was thinking of storing a blocking/nonblocking flag in the file descriptor itself, to avoid an extra system call.

However, I'm not sure it's always safe to not release the runtime lock when reading or writing to a file descriptor in non-blocking mode. File I/O never blocks (and never reports EWOULDBLOCK), yet reading or writing several gigabytes on a USB-2 key can take a long time, during which other threads cannot run and the OCaml 5 GC may not be able to synchronize.

However, you could argue that nobody in their right mind sets non-blocking mode on descriptors opened on regular files, and I might agree.

craff commented 1 year ago

I thought about a type parameter to distinguish blocking non blocking. This way the non blocking function will only accept non blocking FD while other function would accept anything.

Le 5 février 2023 06:56:38 GMT-10:00, Xavier Leroy @.***> a écrit :

Thinking out loud: would it make sense to query the OS to know whether the file descriptor is in non-blocking mode in the current Unix.write/Unix.read and avoid the runtime lock/buffer copy dance in that case?

I was thinking of storing a blocking/nonblocking flag in the file descriptor itself, to avoid an extra system call.

However, I'm not sure it's always safe to not release the runtime lock when reading or writing to a file descriptor in non-blocking mode. File I/O never blocks (and never reports EWOULDBLOCK), yet reading or writing several gigabytes on a USB-2 key can take a long time, during which other threads cannot run and the OCaml 5 GC may not be able to synchronize.

However, you could argue that nobody in their right mind sets non-blocking mode on descriptors opened on regular files, and I might agree.

-- Reply to this email directly or view it on GitHub: https://github.com/ocaml/ocaml/issues/11992#issuecomment-1418152192 You are receiving this because you authored the thread.

Message ID: @.***>

craff commented 1 year ago

This idea of type parameter is also interesting for socket option like

TCP_CORK : [`Linux] gen_sock_bool_option 
TCP_NOPUSH : [`Bsd] gen_sock_bool_option

and

type all = [`Linux|`Windows|`Bsd|...]
type sock_bool_option : all gen_sock_bool_option

for compatibility.

Then the type of setsockopt would be platform dependent and with the correct design we could have:

github-actions[bot] commented 8 months ago

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

OlivierNicole commented 7 months ago

We probably shouldn’t let this be closed as stale as, judging by the reactions, the idea of improving the performance in the case of non-blocking I/O seems worth considering.

craff commented 7 months ago

Xavier Leroy @.***> writes:

However, I'm not sure it's always safe to not release the runtime lock when reading or writing to a file descriptor in non-blocking mode. File I/O never blocks (and never reports EWOULDBLOCK), yet reading or writing several gigabytes on a USB-2 key can take a long time, during which other threads cannot run and the OCaml 5 GC may not be able to synchronize.

However, you could argue that nobody in their right mind sets non-blocking mode on descriptors opened on regular files, and I might agree.

Hello,

A very late reply: if you read several gigabytes (from a socket or a real file), you do so in a buffer with a limited size (64Kb, may be 1Mb ?), especially if you are writing a server that wants to handle 100000 req/s or more with some fairness.

The current version stack-allocates and copy a buffer of size UNIX_BUFFER_SIZE=64Kb and limit each read/write to that size. May be the actual limit of Unix.read/write shoud be mentionned in the documentation ?

With read/write for non_blocking the user will be responsible to not read/write too many bytes at once and this should be in the documentation.

Cheers, Christophe

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you authored the thread.

-- Christophe Raffalli tél: +689 87 23 11 48 web: http://raffalli.eu my mails should pass DKIM/SPF tests/mes messages doivent passer les tests DKIM/SPF

craff commented 7 months ago

I am going to propose a PR one the line mentionned above + one change. 2 of the changes could be applied to several functions in the Unix module:

  1. mark with untagged the ofs, len parmeter and return value
  2. mark with noalloc, and raise the exception from another OCaml function
  3. avoid releasing the global lock.

I think I should do one PR with the new function. Any comments on the code before I do the PR. I for instance wonder if CAMLprim/CAMLreturn/CAMLparam1(buf) are usefull with noalloc. I guess yes with multicore, but I would like to be sure and the documentation is not that all clear.

The C code for single_write_for_non_blocking:

CAMLprim long caml_fast_single_write(value fd, value buf, long ofs,
                                      long len)
{
  CAMLparam1(buf);
  int ret;

  ret = write(Int_val(fd), &Byte(buf, ofs), len);
  CAMLreturn(ret);
}

CAMLprim value caml_byte_fast_single_write(value fd, value buf, value vofs,
                   value vlen) {
  CAMLparam0();
  CAMLreturn(Int_val(caml_fast_single_write(fd,buf,Int_val(vofs),Int_val(vlen))));
}

CAMLprim void caml_uerror(value msg) {
  CAMLparam0();
  caml_uerror(String_val(msg), Nothing);
  CAMLreturn0;
}

with the ocaml part being:

external raw_single_write : Unix.file_descr -> Bytes.t ->
                            (int [@untagged]) -> (int [@untagged]) -> (int [@untagged])
  = "caml_byte_fast_single_write" "caml_fast_single_write" [@@noalloc]

external uerror : string -> 'a = "caml_uerror"

let single_write_for_non_blocking fd buf ofs len =
  if ofs < 0 || len < 0 || ofs+len > Bytes.length buf then
    invalid_arg "single_write";
  let ret = raw_single_write fd buf ofs len in
  if ret == -1 then uerror "single_write_for_non_blocking";
  ret
gasche commented 7 months ago

I think that you do not need the CAMLparam and CAMLreturn here. In the first function, ret has type long so technically it should be CAMLreturnT(long, ret) I think -- if you keep the macros.

craff commented 7 months ago

I think that you do not need the CAMLparam and CAMLreturn here. In the first function, ret has type long so technically it should be CAMLreturnT(long, ret) I think -- if you keep the macros.

I did issue #13020 for the documentation for section 11.2, because I should know from the documentation.