masneyb / gftp

gFTP is a free multithreaded file transfer client for *NIX based machines. 56 language translations available.
http://www.gftp.org
MIT License
116 stars 21 forks source link

Crashes since 2.9.0b #144

Closed MartinX3 closed 2 years ago

MartinX3 commented 2 years ago

Arch Linux

Since 2.9.0b suddenly the directory browser becomes white and if I manually change the path the console tells me to wait for CHDIR.

Or it crashes after entering my password and pressing the enter key with a segfault. There is no error message, just Speicherzugriffsfehler (Speicherabzug geschrieben) in my bash terminal.

wdlkmpx commented 2 years ago

It looks like 2.9.0b is a massive fail haha, let me see what I can do 2.9.0b1 must be released as soon as possible

wdlkmpx commented 2 years ago

I fixed a segfault, please test the most recent commit, it might fix the issue for you

Although reverting the commit and trying to the reproduce the segfault again, it's no longer happening. This obscure behavior in C apps is called voodoo

MartinX3 commented 2 years ago

Thank you, this fixed the segfault issue. I recommend you switching to rust, it warns for some kind of bugs while compiling. It's a new language to learn, but it slowly replaces C in Android and the linux kernel. There is even an entire OS written in RUST. The code quality will rise.

The empty directory list issue sadly doesn't throw error messages. I only get

257 "/content/index" is your current location
Ordnerliste /content/index wird von der Gegenstelle geladen (LC_TIME=de_DE.UTF-8)
EPSV
229 Extended Passive mode OK (|||25283|)
MLSD
150 Accepted data connection
Chdir: Bitte klicken Sie zuerst auf »Stop«, bevor Sie etwas anderes machen

The last one appears, if I manually enter a new path in the path text field, after the bug appears.

wdlkmpx commented 2 years ago

Yeah one day I'll try Rust

I'm not able to reproduce the bug, I need more information. Server software name and version? Encrypted or unencrypted connection?

It looks like a connection issue, something that a firewall might be preventing from establishing

I see no issue connecting to these sites:

MartinX3 commented 2 years ago

I use the FTP mode and 2.8.0 works fine. While switching between 2.9.0 and 2.8.0 I can always reproduce it with 2.9.0

I use arch Linux Newest dependency system packages

Should I provide more informations? Currently I'm away from my computer.

wdlkmpx commented 2 years ago

I need information about the server, it might require a custom fix just like NetBSD-ftpd (the server name is NetBSD-ftpd, gFTP is the client name)

Is the connection encrypted or unencrypted? Does it happen with other sites?

I also need you to connect to ftp.bit.nl and ftp.netbsd.org and try to reproduce the bug using those sites. I see no issues.

You can also comment out this line, recompile the app and see what happens https://github.com/masneyb/gftp/blob/master/lib/protocol_ftp.c#L106 // { FTP_FEAT_MLSD, "MLST" },

wdlkmpx commented 2 years ago

I got a strange segfault that only happened once, and I guess IP_version was not NULL, but was pointing to an invalid memory location

Something strange may happen when running a new gFTP version with an old config from a previous version in $HOME/.gftp/. I use a live distro that has an older gFTP version. Every once in a while I start over, and that happened today.

@RocketMan might want to test this in Solaris and report if something is broken. Perhaps MLSD should be disabled in the bugfix release or something

wdlkmpx commented 2 years ago

I found a bug that is not a bug, but it certainly breaks compatibility with pure-ftpd in BrokenClientsCompatibility mode

That is/was the default setting in a distro full of kennels. The default pure-ftpd package

pure-ftpd reports EPSV support but:

EPSV
500 Unknown command
Disconnecting from site localhost

There are 2 ways to deal with this: 1) Notify the user that the FTP server is broken, or maybe it's pure-FTPd in BrokenClientsCompatibility=yes , pure-ftpd -b, pure-ftpd --brokenclientscompatibility 2) Fall back to PASV and disable EPSV/EPRT/MLSD

I think I'll implement both methods, 1) notify about a broken FTP server and 2) use PASV/PORT/LIST instead Only method 1) is possible if using ipv6 protocol

This is one example of many things that can go wrong

wdlkmpx commented 2 years ago

I was able to reproduce the MLSD issue reported by @MartinX3, it only happens with pure-FTPd

I uncommented this debug line, which only prints a message https://github.com/masneyb/gftp/blob/master/lib/protocol_ftp.c#L188

And that fixed the issue for me, I know, it doesn't make sense at all. So adding printf (""); would fix the issue, but it will trigger a compiler warning

This is disconcerting, there are other servers that send all sorts of strings and lines, but the issue only happens with pure-FTPd, and the possible fix is as absurd as it gets

wdlkmpx commented 2 years ago

I think I fixed the connection issue that happens after PASV or EPSV

150 Accept ... and nothing happens

At some point ftp_response_read() got broken (permanent segfaults) and I couldn't fix it. So I devised an alternate method, using a single buffer and avoiding many dynamic allocations and reallocations, this is the method that has been improved..

see commit ac15fa887e871eff60196e5dc4889fbea3981ced

Please test the latest commit...

MartinX3 commented 2 years ago

wohoooo, I think you fixed it. Thank you for your hard work. <3

wdlkmpx commented 2 years ago

Thanks for the timely bug report

I'll produce a bugfix release maybe tomorrow, unless another pressing issue is found

RocketMan commented 2 years ago

I got a strange segfault that only happened once, and I guess IP_version was not NULL, but was pointing to an invalid memory location

@RocketMan might want to test this in Solaris and report if something is broken. Perhaps MLSD should be disabled in the bugfix release or something

Hello,

Built 2.9.0b, ran fine for me under Solaris. Restarted and got the segfault upon opening any site. Rebuilding with the patch fixed it.

Didn't do a through regression test, but overall looks fine in Solaris (with the patch).

wdlkmpx commented 2 years ago

I think I figure out why ftp_read_response() got broken and now I know, now I know, know I know, the earth will shake, in 2 will break, death all around around around around arounddd

Currently, the buffer used by ftp_read_response does not persist, and that is required when the server sends more than 1 numeric response at the same time.

It works fine. I've tested like 100 ftp servers, and it's ready for 2.9.1b, with some enhancements and sanity checks to support truly broken servers. But it's not (technically) correct.

MartinX3 commented 2 years ago

so a fix broke broken broke servers where I wonder why its owners not gonna go broke. (OVH YOU DAMN CHEAP LITTLE STARTUP COMPANY).

I hope the C Shinigamis will hunt them down, get their names and write them down in their notebooks.

Do you have technical details for me I could send this little startup toilet seat company?

wdlkmpx commented 2 years ago

Broken servers advertise support for features that they don't actually support: I found 3 public servers using pure-FTPd in a specific mode that breaks EPSV, ESPV is disabled and PASV is used for the rest of the session

Broken serves don't support LIST -al, I only found one, but I don't have a proper fix for that, because no numeric error is reported, the filelist contains the error message.

There are like 1000 public servers to test, but it takes a lot of time. GFTP works fine, but the logic to handle server responses is not correct

I could have fixed this months ago without implementing an alternate method, running git bisect, autogen and configure take longer than compiling the app, and the dying hard disk is trashed

That's why I'll replace autotools with my custom build system, all the projects I forked or maintain will suffer the same tragic fate

MartinX3 commented 2 years ago

Sounds like a reason to migrate to RUST. :P

wdlkmpx commented 2 years ago

Yeah memory management in C is a nightmare

It all started when I noticed a segfault in transfers in active connections, the problem is that gftp_get_line creates a buffer and destroys it when it has consumed all the lines or no data has been retrieved from the socket (I think). The app is multithreaded and some variables may be copied to new requests that create new connections. Voodoo happens under certain conditions. I made a silly mistake and ftp_read_response() started to segfault permanently

I fixed my alternate method which is more efficient but not as robust as gftp_get_line

The buffer should be created once and destroyed along the connection, the read function should reset the string \0 and the position rather than destroying the buffer,

So this is the correct logic

static int ftp_read_response (gftp_request * request, int disconnect_on_42x)
{
  //DEBUG_PRINT_FUNC
  // The retrieved string from the socket may contain 1 or more lines, 1 or more numeric codes
  //   but the function must return the first numeric code
  // The buffer should retain the rest of the string to be processed
  //   in the next call to this function
  // Only after processing the stored lines, the function must retrieve
  //   more data from the socket
MartinX3 commented 2 years ago

I'm glad people still fight these C Dragons with success. Sadly if I see a pointer my brain shuts down and screams for kotlin. 🤪