Open qoega opened 3 years ago
Hi @qoega,
Thanks for trying out ControlFlag. c_lang_if_stmts_6000_gitrepos.ts is the dataset generated from repositories that use C as their primary language. It should also work for scanning C++ projects, although it is more effective on projects written primarily in C.
I will try to reproduce the crash on my end. Just wanted to let you know that we have also released smaller training datasets for limited-memory devices (although memory capacity does not appear to be the cause of this crash).
I am also hitting this bug. What is its current status?
Thank you
Hi @xback,
Thanks for trying out ControlFlag. Did you try using a smaller version of the dataset? We have seen that most of these crashes come from using a dataset larger than the memory available on the system. Thanks.
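For anyone who wants to rule memory out quickly before trying a smaller dataset, here is a rough sketch of such a check. It is not part of ControlFlag. The function names and the `overhead` multiplier are hypothetical, and the multiplier is a guess rather than a measured figure for how much memory a loaded dataset actually needs. `SC_AVPHYS_PAGES` is assumed to be available, which is the case on Linux but not on every platform.

```python
import os
import sys


def available_memory_bytes() -> int:
    """Physical memory currently available, read via sysconf (Linux)."""
    return os.sysconf("SC_AVPHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")


def dataset_may_fit(dataset_bytes: int, available_bytes: int,
                    overhead: float = 3.0) -> bool:
    """Return True if the dataset, scaled by a guessed in-memory
    overhead factor, fits into the available memory.

    `overhead` is an assumption, not a number taken from ControlFlag.
    """
    return dataset_bytes * overhead <= available_bytes


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_mem.py c_lang_if_stmts_6000_gitrepos.ts
    size = os.path.getsize(sys.argv[1])
    avail = available_memory_bytes()
    verdict = "may fit" if dataset_may_fit(size, avail) else "likely too large"
    print(f"dataset: {size} bytes, available: {avail} bytes -> {verdict}")
```

If this reports that the dataset may fit and the scan still crashes, memory capacity is probably not the cause and a reproducer is more useful than a smaller dataset.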
Hi, the test ran on a system with 1 TB of RAM (really), of which more than 900 GB was free.
Thanks for the info, @xback. Let us look into reproducing the issue. Would you mind pointing us to the repository that you have been scanning with ControlFlag (if it is a public repository)? That would help us expedite the process. Thanks.
Would you mind pointing us to the repository that you have been scanning with ControlFlag (if it is a public repository)?
Unfortunately, the repo is not public, but I'll try to provide more details or a reproducer.
Hi @xback, we scanned the ClickHouse code using the large version of the dataset, and the scan finished without any issues. In short, we do not see the crash on our end. Please provide us with a reproducer at your convenience. Thanks.
I tried to scan the ClickHouse codebase, but ControlFlag crashed. You can get the ClickHouse codebase directly from GitHub:
PS: Was c_lang_if_stmts_6000_gitrepos.ts trained on C projects only, or on C++ projects as well? I did not find https://github.com/ClickHouse/ClickHouse in the C++ projects list. It is written in C++ and has 20K stars and 800 contributors.