zu1k / nali

An offline tool for querying IP geographic information and CDN provider. 一个查询IP地理信息和CDN服务提供商的离线终端工具.
https://github.com/zu1k/nali
MIT License
3.75k stars 340 forks source link

fix: we should not use bufio and to keep read and write behavior consistent #133

Closed mzz2017 closed 1 year ago

mzz2017 commented 2 years ago

After the fix PR #132, we found that in different mtr display modes, nali still has exceptions.

To reproduce it, try:

> mtr baidu.com | nali
# and then press d to switch the display mode

At the beginning, we fixed it by scanning control runes (rune<0x20). Although it fixed the most problems, we found there was also some abnormal cursor jitter.

Finally, we think we shouldn't use bufio because it uses cache to delay reads. We should unblock as soon as the write finish on the other side of the pipe ends, and return the read result, process it and print it on the screen, which is what our PR does.

Thanks co-debbuger @cubercsl

zu1k commented 2 years ago

Does this cause an IP to be split into two processing so that it cannot be identified?

mzz2017 commented 2 years ago

I do not know the application scenarios that print the half of an IP and then print the remain later.

zu1k commented 2 years ago

I do not know the application scenarios that print the half of an IP and then print the remain later.

io.Copy use a fix size buffer, this may cause unexpected truncation. This needs to be verified.

zu1k commented 2 years ago

This needs to be verified.

I generate a random txt file with (32*1024-2) bytes, then append 2 IPs.

# cat a.txt | nali
......... print something random .........
8iWJwiqGn6K625WveEQTIEYjBACg8C41vJis7dNqBQPOsCoFCfkwN2TRQ1PH
1.2.3.4
5.6.7.8 [德国] 

We can see the first IP 1.2.3.4 not pasrsed, which caused by unexpected truncation.

zu1k commented 2 years ago

I generate a random txt file with (32*1024-2) bytes, then append 2 IPs.

tr -dc A-Za-z0-9 </dev/urandom | head -c 32766  > a.txt
echo "" >> a.txt
echo 1.2.3.4 >> a.txt
echo 5.6.7.8 >> a.txt

cat a.txt | ./nali 
mzz2017 commented 2 years ago

I knew this problem. Golang use 4KB buffer to read from the file descriptor, and it is reasonable. If you insist on it, we can cache several bytes for the next read.

cubercsl commented 2 years ago

Confirmed that this issue also exists in the nodejs version (nali-cli), since the nodejs version uses a similar implementation.

mzz2017 commented 2 years ago

Full domain has a 253 chars limit. IPv4 has 4 chars and IPv6 16 chars.

mzz2017 commented 2 years ago

Or if you do not consider about the performance, we can also fallback to scan control chars.

zu1k commented 2 years ago

Looks like mtr using tui causes these issues, tui is not suitable to be piped, so I think it is a better choice to add a display method suitable for pipe to mtr.

mzz2017 commented 2 years ago

If we scan control chars, there is another problem. If the input program does not print the control char (like \r, \n and EOF), we will block on it.

zu1k commented 2 years ago

If the input program do not print the control char (like \r, \n and EOF), we will block on it.

This is the desired behavior, as if grep would do the same.

mzz2017 commented 2 years ago

We all know grep scans rows. So it works as expected.

zu1k commented 2 years ago

grep scans rows

nali is designed to scan lines too.

mzz2017 commented 2 years ago

OK. nodejs nali-cli performs as this PR. I think an argument is better, right?

zu1k commented 2 years ago

I think an argument is better

I think maybe we can add a param to specifically compatible with tui.

zu1k commented 2 years ago

It is not easy to handle tui. In addition to various control characters, it is also necessary to consider whether the original structure will be destroyed, especially some components of tui originally limited the length, and adding ip geo info will cause dislocation.

mzz2017 commented 2 years ago

What do you think about that I would try to fix the problem mentioned above by cache strategy. And we can see if it will be acceptable.

mzz2017 commented 2 years ago

You are right. We can only try to do it. It cannot be perfect.

zu1k commented 2 years ago

What do you think about that I would try to fix the problem mentioned above by cache strategy. And we can see if it will be acceptable.

bufio is already a cache strategy, can we do something on the top of bufio?

Or you can describe your caching strategy first, and we can first discuss whether it will introduce other issues, then we make a decision.

mzz2017 commented 2 years ago

It's pleasure to discuss with you. What about tonight? Sorry for that I have something to do now.

zu1k commented 2 years ago

I wonder if the cursor jitter you described earlier are caused by a \n or \r followed by other control characters?

zu1k commented 2 years ago

What about tonight? Sorry for that I have something to do now.

I will check the email notification regularly, you can discuss at any time, this is unlimited.

zu1k commented 2 years ago

I will create an issue, let's discuss there, so that others can search and participate.

zu1k commented 1 year ago

I will close this for long time no reply, if there are new developments, please discuss them in the issue.