danog / MadelineProto

Async PHP client API for the telegram MTProto protocol
https://docs.madelineproto.xyz
GNU Affero General Public License v3.0
2.76k stars 619 forks source link

downloadToDir takes forever #1409

Closed humpataa closed 8 months ago

humpataa commented 10 months ago

The file that causes the issue seems to lock itself. Using this code:

if (isset($message['media']) && isset($message['media']['photo'])) {
if (!file_exists($message['id'])) mkdir($message['id']);
$img = $MadelineProto->downloadToDir($message['media'], $message['id']."/");
}

It creates two files in the target directory. I can see both of these files in the target directory (both size 0):

5773634797422362540_y_4.jpg.lock
5773634797422362540_y_4.jpg

This does not happen with all images, but now and then, resulting in server timeout (taking forever). Very annoying. Please advise.

danog commented 10 months ago

What do the logs say when the download gets stuck?

humpataa commented 10 months ago

Thanks for the quick reply. Don't know whats latest, doesn't have timestamp ... last lines are:

PingLoop, 268088803:        Ping DC 2.0
Session, 268088803:         Garbage collected 26 incoming messages in DC 2, 0 left
Session, 268088803:         Garbage collected 20 outgoing messages in DC 2, 0 left
UpdateHandler, 268088803:   Parsing updates (updateShort) received via the socket...
FeedLoop, 268088803:        Resumed update feed loop generic
UpdateHandler, 268088803:   Saving update of type updateChannelUserTyping
UpdateHandler, 268088803:   Parsing updates (updateShort) received via the socket...
FeedLoop, 268088803:        Resumed update feed loop generic
UpdateHandler, 268088803:   Saving update of type updateChannelUserTyping
SessionPaths, 268088803:    Waiting for exclusive lock of /var/www/vhosts/****/httpdocs/tele/session.madeline/safe.php.lock...
SessionPaths, 268088803:    Got exclusive lock of /var/www/vhosts/****/httpdocs/tele/session.madeline/safe.php.lock...
SessionPaths, 268088803:    Waiting for exclusive lock of /var/www/vhosts/****/httpdocs/tele/session.madeline/lightState.php.lock...
SessionPaths, 268088803:    Got exclusive lock of /var/www/vhosts/****/httpdocs/tele/session.madeline/lightState.php.lock...
APIWrapper, 268088803:      Saved session!
UpdateHandler, 268088803:   Parsing updates (updateShort) received via the socket...
FeedLoop, 268088803:        Resumed update feed loop generic
UpdateHandler, 268088803:   Saving update of type updateChannelUserTyping
danog commented 10 months ago

Make sure your php.ini has unlimited execution time and please try reproducing the issue to get more logs

humpataa commented 10 months ago

Have tried again. Have reset the logfile. It reaches almost 90k in size trying to get 15 messages, only a few of them have images. It already takes minutes until it times out, I will not make it run even longer, this is no option. Here again, the last lines of the log after it times out:

UpdateHandler, 268088803:   Saving update of type updateChannelUserTyping
UpdateHandler, 268088803:   Parsing updates (updateShort) received via the socket...
FeedLoop, 268088803:        Resumed update feed loop generic
UpdateHandler, 268088803:   Saving update of type updateChannelUserTyping
UpdateHandler, 268088803:   Parsing updates (updateShort) received via the socket...
FeedLoop, 268088803:        Resumed update feed loop generic
UpdateHandler, 268088803:   Saving update of type updateChannelUserTyping
SessionPaths, 268088803:    Waiting for exclusive lock of /var/www/vhosts/****/httpdocs/tele/session.madeline/safe.php.lock...
SessionPaths, 268088803:    Got exclusive lock of /var/www/vhosts/****/httpdocs/tele/session.madeline/safe.php.lock...
SessionPaths, 268088803:    Waiting for exclusive lock of /var/www/vhosts/****/httpdocs/tele/session.madeline/lightState.php.lock...
SessionPaths, 268088803:    Got exclusive lock of /var/www/vhosts/****/httpdocs/tele/session.madeline/lightState.php.lock...
APIWrapper, 268088803:      Saved session!
UpdateHandler, 268088803:   Parsing updates (updateShort) received via the socket...
FeedLoop, 268088803:        Resumed update feed loop generic
UpdateHandler, 268088803:   Saving update of type updateChannelUserTyping
UpdateHandler, 268088803:   Parsing updates (updateShort) received via the socket...
FeedLoop, 268088803:        Resumed update feed loop generic
UpdateHandler, 268088803:   Saving update of type updateChannelUserTyping
Session, 268088803:         Garbage collected 23 incoming messages in DC 2, 0 left
Session, 268088803:         Garbage collected 19 outgoing messages in DC 2, 0 left
UpdateHandler, 268088803:   Parsing updates (updateShort) received via the socket...
FeedLoop, 268088803:        Resumed update feed loop generic
UpdateHandler, 268088803:   Saving update of type updateChannelUserTyping
UpdateHandler, 268088803:   Parsing updates (updates) received via the socket...
SeqLoop, 268088803:         Was fed updates of type updates...
FeedLoop, 268088803:        Resumed update feed loop channel 1661098541
UpdateHandler, 268088803:   Saving update of type updateNewChannelMessage
UpdateHandler, 268088803:   Parsing updates (updateShort) received via the socket...
FeedLoop, 268088803:        Resumed update feed loop generic
UpdateHandler, 268088803:   Saving update of type updateChannelUserTyping
Session, 268088803:         Garbage collected 21 incoming messages in DC 2, 0 left
Session, 268088803:         Garbage collected 19 outgoing messages in DC 2, 0 left

Again it leaves two files being zero:

5773634797422362540_y_4.jpg.lock
5773634797422362540_y_4.jpg
humpataa commented 10 months ago

It seems to go on after the script actually timed out: the log keeps rising and rising. It's 263kb now. What is going on here?

humpataa commented 10 months ago

Had to restart webserver to stop things working in background. Very strange, very annoying. Could you please explain what you scripts are doing? How come it never stops? It will probably write to the logfile forever until space runs out. This is not good. Please fix this.

danog commented 10 months ago

This is not a bug, this is intended, the worker must always keep running in the background, the logfile size is limited to 10mb by default in the settings, after that it's cleared automatically.

humpataa commented 10 months ago

Okay, but still: why does writing of files locks itself? At least I believe this is the problem.

humpataa commented 10 months ago

I have tried changing the timeout but it doesn't seem to be relevant to downloadToDir:

$settings = (new danog\MadelineProto\Settings\Connection)
    ->setTimeout(5000);

What does the number mean anyway - seconds, milliseconds, hours? Please document.

I have also tried setting the logfile size, but this did not have any effect either:

$settings = (new \danog\MadelineProto\Settings\Logger)
    ->setType(Logger::FILE_LOGGER)
    ->setExtra('custom.log')
    ->setMaxSize(10);

What does the number mean - bytes, kilobytes, megabytes? Please document.

humpataa commented 10 months ago

I have identified the channel post that is causing the problem of taking forever when I try to download the attached image. I have a dump of the message data attached. Anything that looks strange? Please help, thank you! message_dump.txt

humpataa commented 10 months ago

Last lines of the log file:

Session, 268088803:         Garbage collected 0 outgoing messages in DC -2, 1 left
PingLoop:           Ping DC 2.0
SessionPaths, 268088803:    Waiting for exclusive lock of /var/www/vhosts/****/httpdocs/tele/session.madeline/safe.php.lock...
SessionPaths, 268088803:    Got exclusive lock of /var/www/vhosts/****/httpdocs/tele/session.madeline/safe.php.lock...
SessionPaths, 268088803:    Waiting for exclusive lock of /var/www/vhosts/****/httpdocs/tele/session.madeline/lightState.php.lock...
SessionPaths, 268088803:    Got exclusive lock of /var/www/vhosts/****/httpdocs/tele/session.madeline/lightState.php.lock...
APIWrapper, 268088803:      Saved session!
Session, 268088803:         Garbage collected 8 incoming messages in DC 2, 0 left
Session, 268088803:         Garbage collected 8 outgoing messages in DC 2, 0 left
Session, 268088803:         Garbage collected 1 incoming messages in DC 4, 0 left
Session, 268088803:         Garbage collected 1 outgoing messages in DC 4, 0 left
Session, 268088803:         Can't garbage collect upload.getFile with message ID 7280041799480677824 sent 2150 seconds ago in DC -2, no response has been received or it wasn't yet handled!
Session, 268088803:         Garbage collected 0 outgoing messages in DC -2, 1 left
Session, 268088803:         Garbage collected 7 incoming messages in DC 2, 0 left
Session, 268088803:         Garbage collected 7 outgoing messages in DC 2, 0 left
Session, 268088803:         Can't garbage collect upload.getFile with message ID 7280041799480677824 sent 2160 seconds ago in DC -2, no response has been received or it wasn't yet handled!
Session, 268088803:         Garbage collected 0 outgoing messages in DC -2, 1 left
humpataa commented 10 months ago

Any idea please?

mtalaeii commented 10 months ago

Any idea please?

Try increase upload and download parallel chunks in settings and try again!

danog commented 8 months ago

Should be all fixed