Closed bakul closed 5 years ago
I tried the above commands multiple times using kona.
Mostly, I got the same result as you, 1418 characters. A few times, I got 2836 characters (2 times 1418). Once, I got 14180 characters (10 times 1418).
In the file src/0.c there is a function K _4d_(S srvr,S port,K y) with the line
C buf[20000]; n=read(sockfd,&buf,20000); r=close(sockfd); if(r)R FE;
The first problem is that the buffer size is 20000.
It can't possibly yield the 51490 result that you got in k3.
The second problem is that the read is yielding inconsistent multiples of 1418.
I have not figured out why, yet.
I suspect kona may be opening the socket in nonblocking mode, which would return with EAGAIN err in case there was no data ready to be read.
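One way to test this suspicion directly is to ask the kernel for the descriptor's status flags. The following is only a sketch: is_nonblocking is a hypothetical helper, not something that exists in kona, and it would have to be called on sockfd inside _4d_ after connect.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not part of kona).
   Returns 1 if fd has O_NONBLOCK set, 0 if it is in blocking mode,
   and -1 if fcntl itself fails (for example, on an invalid descriptor). */
int is_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) return -1;
    return (flags & O_NONBLOCK) ? 1 : 0;
}
```

A socket created with a plain socket() call and never passed through fcntl(F_SETFL) or opened with SOCK_NONBLOCK should report 0 here.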
Yes, nonblocking mode might be the problem.
Another possibility, from the Linux man page for read:
read() attempts to read up to count bytes from file descriptor fd
into the buffer starting at buf.
On files that support seeking, the read operation commences at the
file offset, and the file offset is incremented by the number of
bytes read. If the file offset is at or past the end of file, no
bytes are read, and read() returns zero.
The key words are "attempts" and "offset". Does the target file support "seeking"? Maybe multiple reads are necessary.
There should be no difference for a normal file read. This is related to sockets only; seeking is not relevant. Assuming the socket is non-blocking, you'd have to continue reading until a read returns 0 bytes, and errors related to non-blocking mode (EAGAIN) should be handled. If you do the equivalent of a blocking read, you cannot interrupt it, yet such a read can take a very long time depending on the web page, network speed, etc., so it should be possible to interrupt it.
Buffer length doesn't matter as in theory no amount of buffering may be enough. You just have to keep reading it and creating lines.
Just documenting what I found. I will continue to research this issue ... EAGAIN does not occur.
I modified the function _4d_ to print errno 5 times:
K _4d_(S srvr,S port,K y){
struct addrinfo hints, *servinfo, *p; int rv,sockfd; S errstr; I r;
memset(&hints,0,sizeof hints); hints.ai_family=AF_UNSPEC; hints.ai_socktype=SOCK_STREAM;
O("errno0: %d\n",errno);
if((rv=getaddrinfo(srvr,port,&hints,&servinfo))){fprintf(stderr,"conn: %s\n",gai_strerror(rv)); R DOE;}
O("errno1: %d\n",errno);
for(p=servinfo; p!=NULL; p=p->ai_next)
if((sockfd=socket(p->ai_family,p->ai_socktype,p->ai_protocol))==-1)continue;
else if(connect(sockfd,p->ai_addr,p->ai_addrlen)==-1){errstr=strerror(errno); r=close(sockfd); if(r)R FE; continue;}
else break;
if(p==NULL){fprintf(stderr, "conn: failed to connect (%s)\n",errstr); freeaddrinfo(servinfo); R DOE;}
I n=strlen(kC(y)); C msg[n+5]; I i=0; for(i=0;i<n+1;i++){msg[i]=kC(y)[i];}
msg[n]='\r'; msg[n+1]='\n'; msg[n+2]='\r'; msg[n+3]='\n'; msg[n+4]='\0';
if(write(sockfd, &msg, strlen(msg))==-1){r=close(sockfd); if(r)R FE; R WE;}
C buf[20000];
O("errno2: %d\n",errno); errno=0; O("errno3: %d\n",errno);
n=read(sockfd,&buf,20000);
O("errno4: %d\n",errno);
r=close(sockfd); if(r)R FE;
K z=newK(n==1?3:-3,n); memcpy(kC(z),&buf,n);
freeaddrinfo(servinfo);
if(n==0)R _n();
else R z; }
Just before getaddrinfo, errno is 0.
getaddrinfo sets errno to 101 (ENETUNREACH), although getaddrinfo succeeds (returns 0).
Just before read, errno is still 101.
I reset errno to 0 just in case read also throws a 101 (and reprint errno to verify the reset).
Just after read, errno is still 0.
read reports no error at all.
# `"google.com"`http 4:"GET /"
errno0: 0
errno1: 101
errno2: 101
errno3: 0
errno4: 0
8508
\\
In this case, 8508 characters were received (6 times 1418).
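A note on the errno readings: POSIX only treats errno as meaningful right after a call has signalled failure through its return value, and getaddrinfo reports failure through its return code (an EAI_* value) rather than through errno. So the 101 left behind by a successful getaddrinfo is probably harmless. A minimal sketch of the usual pattern, where read_status is a hypothetical helper and not kona code:

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper (not part of kona).
   Performs one read and stores the byte count in *out.
   Returns 0 if read succeeded, otherwise the errno value set by the
   failed read. errno is consulted only when read returned -1, so a stale
   value left by an earlier call (such as the 101 above) is ignored. */
int read_status(int fd, void *buf, size_t len, ssize_t *out) {
    *out = read(fd, buf, len);
    return (*out == -1) ? errno : 0;
}
```

With this shape there is no need to reset errno to 0 before the call, because the return value of read decides whether errno is looked at.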
The code shows that the socket is not opened in non-blocking mode. If you replace the read logic with something like the following it should fix this problem.
C buf[20000]; n = 0;
do {
I n1=read(sockfd,&buf[n],sizeof buf-n);
if(n1==0)break;
if(n1<0){O("errno: %d\n",errno); break;}
n += n1;
} while(n<sizeof buf);
r=close(sockfd); if(r)R FE;
But this won't be enough as a website may send an arbitrary amount of data. Even for the test case 20,000 bytes is too small. You will need to allocate space as necessary.
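Allocating space as necessary could look roughly like the sketch below. It assumes a blocking descriptor, uses plain malloc/realloc rather than kona's own allocation, and read_all is a hypothetical name; in _4d_ the result would still have to be copied into (or built directly as) a K character vector.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper (not part of kona).
   Reads from fd until end of stream, doubling the buffer whenever it
   fills up. On success returns a malloc'd buffer that the caller must
   free and stores the number of bytes in *len. Returns NULL on a read
   error or if memory runs out. */
char *read_all(int fd, size_t *len) {
    size_t cap = 4096, used = 0;
    char *buf = malloc(cap);
    if (!buf) return NULL;
    for (;;) {
        if (used == cap) {                       /* buffer full: grow it */
            char *tmp = realloc(buf, cap * 2);
            if (!tmp) { free(buf); return NULL; }
            buf = tmp;
            cap *= 2;
        }
        ssize_t n = read(fd, buf + used, cap - used);
        if (n == 0) break;                       /* peer closed: done    */
        if (n < 0) {
            if (errno == EINTR) continue;        /* interrupted, retry   */
            free(buf);
            return NULL;
        }
        used += (size_t)n;
    }
    *len = used;
    return buf;
}
```

Because the loop stops only when read returns 0, this relies on the server closing the connection after the response, which is what the bare "GET /" request in the test case appears to get.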
Under kona
Under k3
The first 1418 characters seem similar (except for some per-connection unique data), but after that kona gives up too soon.