Open rebcabin opened 9 months ago
I also see a read_file on line 197 of lpython.cpp. I'm still learning what's going on.
The file should only be read once. If it is read twice, that's a bug that we need to fix. There are many file reads, but only one should be executed.
It looks like read_file is called at least twice during parsing: once to parse and once to [gs]et_module_path, and maybe more than once for the latter. Some of the following breakpoints are redundant, but not all of them are.
I set breakpoints on every one of these lines:
find . "(" -iname "*.cpp" -o -name "*.h" ")" -print0 | xargs -0 -P 0 grep -n read_file
./src/lpython/semantics/python_ast_to_asr.cpp:74: bool status = read_file(file_path, input);
./src/lpython/semantics/python_ast_to_asr.cpp:81: status = read_file(file_path, input);
./src/lpython/semantics/python_ast_to_asr.cpp:90: status = read_file(file_path, input);
./src/lpython/semantics/python_ast_to_asr.cpp:129: std::string input = read_file(infile);
./src/lpython/parser/parser.cpp:125: std::string input = read_file(infile);
./src/bin/lpython.cpp:90: std::string input = LCompilers::read_file(infile);
./src/bin/lpython.cpp:103: std::string input = LCompilers::read_file(infile);
./src/bin/lpython.cpp:140: std::string input = LCompilers::read_file(infile);
./src/bin/lpython.cpp:161: std::string input = LCompilers::read_file(infile);
./src/bin/lpython.cpp:172: std::string input = LCompilers::read_file(infile);
./src/bin/lpython.cpp:197: std::string input = LCompilers::read_file(infile);
./src/bin/lpython.cpp:250: std::string input = LCompilers::read_file(infile);
./src/bin/lpython.cpp:295: std::string input = LCompilers::read_file(infile);
./src/bin/lpython.cpp:346: std::string input = LCompilers::read_file(infile);
./src/bin/lpython.cpp:399: std::string input = LCompilers::read_file(infile);
./src/bin/lpython.cpp:443: std::string input = LCompilers::read_file(infile);
./src/bin/lpython.cpp:491: std::string input = LCompilers::read_file(infile);
./src/bin/lpython.cpp:592: std::string input = LCompilers::read_file(infile);
./src/bin/lpython.cpp:707: std::string input = LCompilers::read_file(infile);
./src/bin/lpython.cpp:762: std::string input = LCompilers::read_file(infile);
./src/bin/lpython.cpp:872: std::string input = LCompilers::read_file(infile);
./src/bin/lpython.cpp:945: std::string input = LCompilers::read_file(infile);
./src/bin/lpython.cpp:1019: std::string input = LCompilers::read_file(infile);
./src/bin/lpython.cpp:1296:// std::string input = read_file(infile);
./src/libasr/utils.h:105:bool read_file(const std::string &filename, std::string &text);
./src/libasr/string_utils.h:29:std::string read_file(const std::string &filename);
./src/libasr/utils2.cpp:31:bool read_file(const std::string &filename, std::string &text)
./src/libasr/diagnostics.cpp:126: read_file(s.filename, input);
./src/libasr/codegen/asr_to_llvm.cpp:368: std::string input = read_file(infile);
./src/libasr/string_utils.cpp:106:std::string read_file(const std::string &filename)
./src/libasr/asr_utils.cpp:455: if (read_file(full_path.string(), modfile)) {
Then I ran lpython --show-asr ./tests/expr8.py. The debugger stopped several times on read_file.
If so, we need to fix it.
Marking enhancement and good first issue. It likely means some architectural "re-plumbing" of the front-end. Some file metadata or even the whole file itself will have to be cached somewhere.
I wrote a gdb script to dig into the problem. It logs the arguments and the call location each time read_file is called, writing them to a log file (gdb.txt by default). Run it with gdb -x ./read_file_count.gdb:
# read_file_count.gdb
set logging enabled on
set pagination off
file ./src/bin/lpython
set args --show-asr ./tests/expr8.py
set $count=0
break LCompilers::read_file
commands
silent
set $count = $count + 1
printf "Called function: %s\n", "your_function_name"
printf "Here is the function input args: "
info args
printf "function call location: "
where 1
list
cont
end
r
set logging enabled off
In the log file, read_file is indeed called twice:
Called function: your_function_name
Here is the function input args: filename = "./tests/expr8.py"
function call location: #0 LCompilers::read_file (filename="./tests/expr8.py") at /home/zzh/lpython/src/libasr/string_utils.cpp:109
104 }
105
106 std::string read_file(const std::string &filename)
107 {
108 std::ifstream ifs(filename.c_str(), std::ios::in | std::ios::binary
109 | std::ios::ate);
110
111 std::ifstream::pos_type filesize = ifs.tellg();
112 if (filesize < 0) return std::string();
113
Called function: your_function_name
Here is the function input args: filename = "./tests/expr8.py"
function call location: #0 LCompilers::read_file (filename="./tests/expr8.py") at /home/zzh/lpython/src/libasr/string_utils.cpp:109
104 }
105
106 std::string read_file(const std::string &filename)
107 {
108 std::ifstream ifs(filename.c_str(), std::ios::in | std::ios::binary
109 | std::ios::ate);
110
111 std::ifstream::pos_type filesize = ifs.tellg();
112 if (filesize < 0) return std::string();
113
Maybe we can cache read_file's result in an unordered_map<string,string>, and return the file content if the filename has been cached?
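For discussion, here is a minimal sketch of what such a cache could look like. The names read_file_uncached, read_file_cached and disk_reads are hypothetical and do not exist in the LPython sources; read_file_uncached is only a stand-in for LCompilers::read_file, with a counter added so the effect is observable.

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <unordered_map>

// Hypothetical counter: how many times the disk was actually touched.
static int disk_reads = 0;

// Hypothetical stand-in for LCompilers::read_file(const std::string &).
// Returns an empty string if the file cannot be opened.
std::string read_file_uncached(const std::string &filename) {
    ++disk_reads;
    std::ifstream ifs(filename, std::ios::in | std::ios::binary);
    if (!ifs) return std::string();
    return std::string(std::istreambuf_iterator<char>(ifs),
                       std::istreambuf_iterator<char>());
}

// Hypothetical caching wrapper: reads each filename from disk at most once
// and returns a reference to the stored contents on later calls.
const std::string &read_file_cached(const std::string &filename) {
    static std::unordered_map<std::string, std::string> cache;
    auto it = cache.find(filename);
    if (it == cache.end()) {
        // First request for this name: read from disk and remember the result.
        it = cache.emplace(filename, read_file_uncached(filename)).first;
    }
    // References to unordered_map elements stay valid when the map rehashes.
    return it->second;
}
```

Some caveats with this sketch: the key is the raw path string, so "./tests/expr8.py" and "tests/expr8.py" would be cached separately; a failed read is cached as an empty string; the cache never notices if the file changes on disk; and the static map is not thread-safe.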
Let's refactor the frontend to only call the read function once. I wonder what is going on.
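As a rough illustration of the "read once" direction, the driver could load the file a single time and hand the loaded text to every later stage, so no stage needs a filename to re-read. Everything below is hypothetical: SourceFile, load_source, parse_stage and module_dir are made-up names, and the two stages are trivial stand-ins for the real parser and module-path logic.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical record for a file that has been loaded exactly once.
struct SourceFile {
    std::string path;  // path as given on the command line
    std::string text;  // full contents, read a single time
};

// The only function in this sketch that touches the disk.
SourceFile load_source(const std::string &path) {
    SourceFile src;
    src.path = path;
    std::ifstream ifs(path, std::ios::in | std::ios::binary);
    src.text.assign(std::istreambuf_iterator<char>(ifs),
                    std::istreambuf_iterator<char>());
    return src;
}

// Stand-in for the parser: works on the already-loaded text.
// Here it only counts newline characters.
std::size_t parse_stage(const SourceFile &src) {
    return static_cast<std::size_t>(
        std::count(src.text.begin(), src.text.end(), '\n'));
}

// Stand-in for module-path handling: needs only the path, not a second read.
std::string module_dir(const SourceFile &src) {
    std::string::size_type slash = src.path.find_last_of('/');
    return slash == std::string::npos ? std::string(".")
                                      : src.path.substr(0, slash);
}
```

With this shape, the driver would call load_source once and pass the same SourceFile to both stages; whether the real frontend can be restructured this way depends on why the second read exists.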
OK, let me explore the frontend's execution flow and try to figure out the root cause.
repro:
lldb lpython -- --show-asr tests/expr8.py
The first read is at line 128 of lpython.cpp.
The second read is at line 125 of parser.cpp.
Is this intentional?