humhub / cfiles

Module for managing files inside spaces and user profiles.
17 stars 19 forks source link

The link "files from the stream" doesn't work with a lot of files AND cause deadlocks! #230

Open Guillaume-Comtois opened 2 weeks ago

Guillaume-Comtois commented 2 weeks ago

I found quite a big problem in the mocule cfile. If you click the link files from the stream and you have too many files, the query will never load and worst, it never stop loading in the background even if you change space, or go anywhere or perform new actions like post and comment. This end up trigger a PDOException: SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock

This line cause the infinit query : protected\modules\cfiles\widgets\FileList.php image

INFINIT SQL count query (count is supposed to be equal to 28299) I found out that it's the OR condition in red that cause the infinit query. Take note that this query has single backslashs, the original has double backslashs. image

yurabakhtin commented 2 weeks ago

@Guillaume-Comtois Thanks for the report. I have tested with 200 files but cannot reproduce the issue, I don't think the problem may be in the line (file.object_model = content.object_model, and I am not sure the SQL query will works slowly for more recrods, for for me the SQL looks correctly:

SELECT
    count(*)
FROM
    `file`
LEFT JOIN `comment` ON
    (file.object_id = comment.id
        AND file.object_model = 'humhub\\modules\\comment\\models\\Comment')
LEFT JOIN `content` ON
    (comment.object_model = content.object_model
        AND comment.object_id = content.object_id)
    OR (file.object_model = content.object_model
        AND file.object_id = content.object_id)
WHERE
    (`content`.`contentcontainer_id` = 2)
    AND (`content`.`state` = 1)
    AND ((`comment`.`object_model` = 'humhub\\modules\\post\\models\\Post')
        OR (`file`.`object_model` = 'humhub\\modules\\post\\models\\Post'))
ORDER BY
    `file`.`updated_at`

cfiles-pages

test-cfiles-count

ArchBlood commented 2 weeks ago

@yurabakhtin I'm not sure this will help but it seems to me that the OR statement may be the issue in some cases, not all, this becomes more of an issue when it comes to the database not being configured correctly to handle larger datasets. To handle such things we could replace the OR statement by splitting the query into two SELECT parts and combining them using UNION ALL which can mitigate the issue. But overall this is just a Band-Aid method, I'd overall suggest that @Guillaume-Comtois check to make sure that everything is being indexed correctly on their end and double check the database configurations.

Guillaume-Comtois commented 2 weeks ago

@yurabakhtin I'm not sure this will help but it seems to me that the OR statement may be the issue in some cases, not all, this becomes more of an issue when it comes to the database not being configured correctly to handle larger datasets. To handle such things we could replace the OR statement by splitting the query into two SELECT parts and combining them using UNION ALL which can mitigate the issue. But overall this is just a Band-Aid method, I'd overall suggest that @Guillaume-Comtois check to make sure that everything is being indexed correctly on their end and double check the database configurations.

image image

Guillaume-Comtois commented 2 weeks ago

I have 1 024 878 content records, 100k file records and the count query is supposed to return 29,000. The query is working until I add the OR conditon in red. Do I need to index a column?

ArchBlood commented 2 weeks ago

I have 1 024 878 content records, 100k file records and the count query is supposed to return 29,000. The query is working until I add the OR conditon in red. Do I need to index a column?

If the issue is with the cfile module then these are the tables to look at;

Screenshot_6

Currently I'm not seeing any issues on my end with the core file module, I've even tested using the develop branch of 1.17.0-dev and see no issues. Could you provide more information on your database type used? Does it meet the requirements for HumHub? Are you using MyISAM, if so I wouldn't recommend using it if you plan on running HumHub;

https://docs.humhub.org/docs/admin/troubleshooting#data-integrity https://docs.humhub.org/docs/admin/requirements#database