proposal: How to add support of local testing for a new language?

j178 commented 1 year ago

我最近一直在思考一个更加统一、规范的 local testing 的实现，减少支持新语言所需的开发工作量，让各种语言的 local testing 表现更加一致。

以生成 C++ 版本的 two-sum 为例，我提议做出以下定义：

一个 Generator 应该生成什么？

question.md [optional]
testcases.txt
solution.h
solution_test.cpp

`question.md`

question.md 包含 Markdown 格式的题目描述，只有当开启 code.separate_descrption_file 时才生成这个文件。否则题目描述包含在代码文件中(solution.h)。

See #94 .

`testcases.txt`

testcases.txt 由多个 case 组成，case 之间由一个空行分隔。

case 包括 input 和 output 两部分，如果 input 参数有多个，每行放置一个。

示例：

target_case: 0

input:
[3,9,20,null,null,15,7]
output:
3

input:
[1,null,2]
output:
2

testcases.txt 中还可能包含一个可选的 Key-value: target_case: 0，表示只执行某一个 case。case 的编号从 1 开始，向下递增。

target_case: 0 中 0 表示执行所有 case.
也可以为负数，-1 代表最后一个 case.

`solution.h`

主要包含经过模板替换、modifier 修改后的 question code snippet，示例：

// 省略描述部分

// @lc code=begin

class Solution {
public:
    int maxDepth(TreeNode* root) {

    }
};

// @lc code=end

`solution_test.cpp`

solution_test.cpp 应该是一个完整的程序，每次调用执行一个独立的 case。

它从 stdin 读入完整的 case input
将 case input 反序列化为函数入参需要的类型，比如 list, list of list, ListNode, TreeNode
调用 solution function，获取 function return value
将 return value 序列化，从 stdout 输出

solution_test.cpp 只是逻辑上的概念，并不一定需要生成为单独的文件，可以与代码文件存在于同一个文件中。

示例：

// Code generated by https://github.com/j178/leetgo
#include <bits/stdc++.h>
#include "LC_IO.h"
#include "solution.h"

using namespace std;

int main(int argc, char **argv) {
    // scan input args
    TreeNode* root; cin >> root;

    // initialize object
    Solution *obj = new Solution();

    // call method
    auto &&res = obj->maxDepth(root);

        // print result
    cout << "output: " << res << endl;

    delete obj;
    return 0;
}

为避免用户自己打印到 stdout 的内容影响 output 的解析，return value 的 output 应该包含 output: 前缀，例如：

output: [0,1]

序列化和反序列化的功能实现应该存在于一个公共的 library 中，由 solution_test.cpp 引用。这个 library 应该由 Generator 的Initialize() 拷贝到 $outDir/common 中。library 的原始代码应该放在仓库的 testutils/$lang 目录下，通过 embed 打包在程序中。示例：

Click to expand

```cpp #ifndef LC_IO_H #define LC_IO_H #include #include /** * Definition for a singly-linked list. */ struct ListNode { int val; ListNode *next; ListNode() : val(0), next(nullptr) {} ListNode(int x) : val(x), next(nullptr) {} ListNode(int x, ListNode *next) : val(x), next(next) {} }; /** * Function for deserializing a singly-linked list. */ std::istream &operator>>(std::istream &is, ListNode *&node) { node = nullptr; ListNode *now = nullptr; L0: is.ignore(); L1: switch (is.peek()) { case ' ': case ',': is.ignore(); goto L1; case ']': is.ignore(); goto L2; default : int x; is >> x; now = (now ? now->next : node) = new ListNode(x); goto L1; } L2: switch (is.peek()) { case '\r': is.ignore(); goto L2; case '\n': is.ignore(); goto L3; case EOF : goto L3; } L3: return is; } /** * Function for serializing a singly-linked list. */ std::ostream &operator<<(std::ostream &os, ListNode *node) { os << '['; while (node != nullptr) { os << node->val << ','; node = node->next; } os.seekp(-1, std::ios_base::end); os << ']'; return os; } /** * Definition for a binary tree node. */ struct TreeNode { int val; TreeNode *left; TreeNode *right; TreeNode(int x) : val(x), left(nullptr), right(nullptr) {} TreeNode(int x, TreeNode *left, TreeNode *right) : val(x), left(left), right(right) {} }; /** * Function for deserializing a binary tree. */ std::istream &operator>>(std::istream &is, TreeNode *&node) { std::deque dq; L0: is.ignore(); L1: switch (is.peek()) { case ' ': case ',': is.ignore(); goto L1; case 'n': is.ignore(5); dq.emplace_back(nullptr); goto L1; case ']': is.ignore(); goto L2; default : int x; is >> x; dq.emplace_back(new TreeNode(x)); goto L1; } L2: switch (is.peek()) { case '\r': is.ignore(); goto L2; case '\n': is.ignore(); goto L3; case EOF : goto L3; } L3: int n = dq.size(); for (int i = 0, j = 1; i < n; ++i) { auto root = dq[i]; if (root == nullptr) { continue; } root->left = j < n ? dq[j] : nullptr; root->right = j + 1 < n ? dq[j + 1] : nullptr; j += 2; } node = n ? dq[0] : nullptr; return is; } /** * Function for serializing a binary tree. */ std::ostream &operator<<(std::ostream &os, TreeNode *node) { std::queue q; int cnt_not_null_nodes = 0; auto push = [&](TreeNode *node) { q.emplace(node); if (node != nullptr) { ++cnt_not_null_nodes; } }; auto pop = [&]() { auto front = q.front(); q.pop(); if (front != nullptr) { --cnt_not_null_nodes; push(front->left); push(front->right); os << front->val << ','; } else { os << "null,"; } }; os << '['; if (node != nullptr) { push(node); while (cnt_not_null_nodes > 0) { pop(); } os.seekp(-1, std::ios_base::end); } os << ']'; return os; } /** * Function for deserializing an array. */ template std::istream &operator>>(std::istream &is, std::vector &v) { L0: is.ignore(); L1: switch (is.peek()) { case ' ': case ',': is.ignore(); goto L1; case ']': is.ignore(); goto L2; default : v.emplace_back(); if constexpr (std::is_same_v) { is >> quoted(v.back()); } else { is >> v.back(); } goto L1; } L2: switch (is.peek()) { case '\r': is.ignore(); goto L2; case '\n': is.ignore(); goto L3; case EOF : goto L3; } L3: return is; } /** * Function for serializing an array. */ template std::ostream &operator<<(std::ostream &os, const std::vector &v) { os << '['; if constexpr (std::is_same_v) { for (auto &&x : v) { os << quoted(x) << ','; } } else if constexpr (std::is_same_v) { for (auto &&x : v) { char buf[320]; sprintf(buf, "%.5f,", x); os << buf; } } else { for (auto &&x : v) { os << x << ','; } } os.seekp(-1, std::ios_base::end); os << ']'; return os; } #endif ```

test generator

这是 leetgo pick 的工作，它负责：

初始化测试环境

包括拷贝 library 代码，初始化语言相关的 workspace (go mod init, cargo init, python -m venv 等)
生成 question.md、testcases.txt、solution.h、solution_test.cpp 文件

test runner

这是 leetgo test -L 的工作，它负责：

编译生成 executable (可选)

对于编译型语言，需要现将 solution_test.cpp 编译为 binary 才能执行。生成的 binary 应该放在 /tmp/leetgo/$questionSlug-$langSlug.exec 。
解析 testcases.txt 抽出每个 case 的 input，作为 stdin 调用 executable，捕获 executable 的 stdout 并解析出 output
比对 output 与 expected output（现阶段是直接的字符串比对，后续可能会支持更智能的判断），展示比对结果。

支持新语言需要的工作

用新语言实现参数的序列化、反序列化
实现 solution_test.$lang 的生成
实现 solution_test.$lang 的编译和调用

大家怎么看？欢迎一起讨论 @w43322 @frostming @acehinnnqru

frostming commented 1 year ago

关于solution_test.cpp 我理解你的目的，所有语言都变成一个 stdin -> program -> stdout的测试程序。抹掉了语言的差异，但这样有两个问题：

为了让用户可以专注于逻辑的实现，leetgo需要实现一个调用层

       ┌────────────────────────────────────────────┐
       │leetgo                                      │
       │                                            │
       │        ┌─────────────────────────┐         │
       │        │ Language native         │         │
  stdin│        │                         │         │stdout
──────►├────────┼────► class Solution─────┼────────►├─────►
       │        │                         │         │
       │        └─────────────────────────┘         │
       │                                            │
       └────────────────────────────────────────────┘

理想情况下用户只需要提供 class Solution，那么leetgo则负责完成(read stdin -> decode as language native)，每个语言需要实现一个互操作层

此外，这样把测试放到黑盒里跑了，不利于用户在IDE里打断点调试（指test runner）

frostming commented 1 year ago

不过话说回来，leetcode的测试程序基本也是按这样实现的(stdin, stdout) @yihong0618

yihong0618 commented 1 year ago

是的，一个 docker 的 runtime 但据说要换掉，但这个据说是我两年前听到的

j178 commented 1 year ago

理想情况下用户只需要提供 class Solution，那么leetgo则负责完成(read stdin -> decode as language native)，每个语言需要实现一个互操作层

read stdin -> decode as language native 这应该是 solution_test.cpp 自己做的，例如这样：

int main(int argc, char **argv) {
    // scan input args
    TreeNode* root; cin >> root;

    // initialize object
    Solution *obj = new Solution();

    // call method
    auto &&res = obj->maxDepth(root);

        // print result
    cout << "$output$" << res << "$output$";

    delete obj;
    return 0;
}

leetgo 的调用层应该是：

解析 testcases.txt 抽出每个 case 的 input，作为 stdin 调用 executable，捕获 executable 的 stdout 并解析出 output

是与语言无关的。

此外，这样把测试放到黑盒里跑了，不利于用户在IDE里打断点调试（指test runner）

因为 solution_test.cpp 是一个完整的程序，所以用户依然是可以打断点调试的。只是这种情况下没有 leetgo 提供 stdin input，用户需要自己复制粘贴 case 到控制台中。

w43322 commented 1 year ago

关于这个问题我有一些想法，但最近比较忙，下周我细说一下

j178 commented 1 year ago

这个 proposal 中的大部分设计都已经在 #116 中实现了，接下来我会来更新 #90 #111 来适配新的测试框架，谢谢各位~

w43322 commented 1 year ago

同意这个proposal，但有一个小问题，output wrapper是不是复杂一点比较好？ output: 可能还是会和用户的输出有冲突

j178 commented 1 year ago

想了想，把 output marker 改成了 @output:, 应该足够了，也不会特别奇怪，谢谢你的建议

j178 commented 1 year ago

或者用 @lc output，跟 code begin/end marker 一致

w43322 commented 1 year ago

1. 编译生成 executable (可选)
   对于编译型语言，需要现将 `solution_test.cpp` 编译为 binary 才能执行。生成的 binary 应该放在 `/tmp/leetgo_cache/$slug.exec` 。

请问临时目录的路径如何获取，目前项目里是否有这种函数呢

j178 commented 1 year ago

@w43322 Hi, 我在 #126 中添加了一个 getTempBinFile 函数，能帮忙 review 吗？

j178 / leetgo