X-lab2017 / open-research

📚 Using open source methods to study open source phenomena.

On submitting to Evaluatology 2024 and joining the exchange on Evaluatology #296

Open will-ww opened 2 months ago

will-ww commented 2 months ago

Description

At the invitation of Prof. Jianfeng Zhan, we are co-organizing The First International Workshop on Evaluatology (Evaluatology 2024). For us, this is a good opportunity to systematically build Open Source Evaluatology (OSE).

Their article A Short Summary of Evaluatology is also worth consulting~

All ideas welcome~

will-ww commented 1 month ago

Based on ChatGPT, one possible output is as follows:

Title:
Open Source Evaluatology: A Framework for Systematic Evaluation of Open Source Contributions

Abstract (English):
Building on the foundational principles of Evaluatology, Open Source Evaluatology proposes a comprehensive framework for evaluating contributions in open source ecosystems. This discipline addresses the decentralized and collaborative nature of open source projects by creating standardized evaluation metrics, benchmarks, and assessment tools. The framework focuses on both quantitative (e.g., code contributions, issue resolutions) and qualitative measures (e.g., community engagement, code quality), providing continuous, lifelong evaluation to encourage transparency, innovation, and sustainable development. Supported by robust models and automation, it aims to enhance project management and decision-making, promoting long-term participation and value creation within global open source communities.

This approach will be particularly valuable to developers, project maintainers, academic researchers, and organizations aiming to better understand and assess the impact of individual contributions.

Title (Chinese):
开源评价学:开源贡献系统化评价框架

Abstract (Chinese):
开源评价学基于评估学的核心原则,提出了一个全面框架,用于评价开源生态系统中的贡献。该框架针对开源项目的去中心化和协作性,建立了标准化的评价指标、基准测试及工具。它结合量化(如代码贡献、问题解决)和质化(如社区参与、代码质量)评价,提供持续和终身的评价,促进透明性、创新性及可持续发展。通过稳健的模型和自动化支持,该框架旨在提升项目管理、优化决策,并激励全球开源社区的长期参与与价值创造。

We can revise and refine this as a starting point~
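To make the quantitative-plus-qualitative idea in the abstract concrete, here is a minimal sketch of a blended contribution score; all metric names, scales, and weights below are hypothetical assumptions for illustration, not part of any proposed standard:

```python
import math
from dataclasses import dataclass

@dataclass
class ContributionMetrics:
    """Hypothetical per-contributor metrics; names and scales are illustrative only."""
    commits: int            # quantitative: merged code contributions
    issues_resolved: int    # quantitative: issues closed by the contributor
    review_quality: float   # qualitative proxy in [0, 1], e.g. maintainer ratings
    engagement: float       # qualitative proxy in [0, 1], e.g. discussion participation

def contribution_score(m: ContributionMetrics,
                       w_quant: float = 0.6, w_qual: float = 0.4) -> float:
    """Blend log-damped quantitative counts with averaged qualitative proxies."""
    quant = math.log1p(m.commits) + math.log1p(m.issues_resolved)
    qual = (m.review_quality + m.engagement) / 2
    return w_quant * quant + w_qual * qual

print(contribution_score(ContributionMetrics(120, 35, 0.8, 0.7)))
```

The log damping is one way to keep raw counts from drowning out the qualitative signals; any real framework would need to calibrate the weights against community judgment.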

will-ww commented 1 month ago

Next, a title and abstract on the dataset side:

Title:
Establishing a Landmark Dataset Framework to Propel Open Source Evaluatology

Abstract (English):
This paper presents the creation of a pioneering dataset framework aimed at driving forward Open Source Evaluatology research. By integrating explicit and implicit problem definitions from a wide array of open source projects, the dataset is designed to ensure high-quality data through meticulous collection methods and rigorous technical analysis. The framework enhances data sharing, reuse, and reproducibility, offering tools for generating large-scale datasets that maintain data integrity. It provides a foundational infrastructure for robust evaluations, promoting transparency, innovation, and sustainable contributions in the open source community.

Title (Chinese):
构建推动开源评价学研究的里程碑式数据集框架

Abstract (Chinese):
本文提出了一个开创性的开源数据集框架,旨在推动开源评价学研究的发展。通过从广泛的开源项目中提取显性和隐性问题定义,该数据集通过严谨的数据收集方法和技术分析确保数据质量。该框架加强了数据共享、再利用和可重复性,提供生成大规模数据集的工具,同时保持数据的完整性。该框架为评价开源贡献提供了基础设施,促进透明性、创新性和可持续性贡献。
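To make the data-integrity point concrete, here is a minimal sketch of a record schema with a content-addressed checksum; all field names are hypothetical assumptions, not the framework's actual schema:

```python
import hashlib
import json
from dataclasses import dataclass, asdict

@dataclass
class ProblemRecord:
    """Hypothetical record for one explicit or implicit problem definition."""
    repo: str          # e.g. "X-lab2017/open-research"
    source: str        # where it was mined: "issue", "pull_request", ...
    problem_type: str  # "explicit" or "implicit"
    text: str          # the extracted problem definition

def record_checksum(rec: ProblemRecord) -> str:
    """Checksum over a canonical JSON encoding, so reusers can verify integrity."""
    payload = json.dumps(asdict(rec), sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

rec = ProblemRecord("X-lab2017/open-research", "issue", "explicit",
                    "How should open source contributions be evaluated?")
print(record_checksum(rec)[:16])  # short prefix for display
```

Canonical JSON (sorted keys) ensures the same record always hashes to the same value, which is the property reproducibility tooling needs.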

will-ww commented 1 month ago

I used the framework from《开源软件通识:知识体系指南》(a guide to the open source software body of knowledge) as input, combined it with the CFP topics, and asked GPT to suggest possible titles. The results are as follows:


Based on the framework you provided, combined with the theme of "Open Source Evaluatology", here are optimized paper titles designed for each major section. These titles reflect the central place of open source projects, communities, ecosystems, innovation, and contribution in open source evaluation.

1. Foundations

  1. "开源评价学中的数字公共品:从开源对象到开源软件的评价标准"
    "Digital Public Goods in Open Source Evaluation Science: Evaluation Standards from Open Source Objects to Software"

  2. "开源治理在开源评价学中的作用:从企业到国家的治理模型"
    "The Role of Open Source Governance in Open Source Evaluation Science: Governance Models from Enterprises to Nations"

  3. "基于开源评价学的开源创新评价框架:价值驱动与贡献导向的结合"
    "An Evaluation Framework for Open Source Innovation in Open Source Evaluation Science: Combining Value-Driven and Contribution-Oriented Approaches"

  4. "开源贡献的评价方法:从代码贡献到社区参与的全面评估"
    "Evaluation Methods for Open Source Contributions: A Comprehensive Assessment from Code Contributions to Community Engagement"

2. Projects

  1. "开源评价学中的项目与代码评估:开源项目健康度与代码质量的双重标准"
    "Project and Code Evaluation in Open Source Evaluation Science: Dual Standards for Project Health and Code Quality"

  2. "开源许可证与代码评价:基于开源评价学的代码与许可证依赖分析"
    "Open Source Licensing and Code Evaluation: A Dependency Analysis between Code and Licensing in Open Source Evaluation Science"

  3. "开源供应链的风险与机会:开源评价学中的供应链评估模型"
    "Risks and Opportunities in Open Source Supply Chains: A Supply Chain Evaluation Model in Open Source Evaluation Science"

3. Communities

  1. "开源社区分类的评价模型:从用户型到项目型的开源社区健康评估"
    "An Evaluation Model for Classifying Open Source Communities: Assessing Community Health from User-Centric to Project-Centric Communities"

  2. "开源社区参与者的贡献评估:基于开源评价学的多维参与分析"
    "Contribution Evaluation of Open Source Community Participants: A Multi-Dimensional Participation Analysis in Open Source Evaluation Science"

  3. "开源社区治理与运营的评价框架:基于规则、基础设施和运营的系统评估"
    "Evaluation Framework for Open Source Community Governance and Operations: A Systematic Assessment of Rules, Infrastructure, and Operations"

  4. "开源社区评估:开源评价学中的健康与影响力评估标准"
    "Community Evaluation in Open Source Evaluation Science: Health and Influence Assessment Standards"

4. Ecosystem

  1. "开源生态的构建与评估:企业、中立组织与开发者的协同效应"
    "Building and Evaluating the Open Source Ecosystem: Synergistic Effects of Enterprises, Neutral Organizations, and Developers"

  2. "商业开源的评价体系:基于开源评价学的企业贡献与创新评估"
    "Evaluation System for Commercial Open Source: Assessing Enterprise Contributions and Innovations in Open Source Evaluation Science"

  3. "开源基金会的评价机制:从社区支持到全球影响力的分析"
    "Evaluation Mechanisms of Open Source Foundations: From Community Support to Global Influence"

  4. "开源人才的成长路径评估:开源评价学中的贡献者培养与发展"
    "Growth Path Evaluation of Open Source Talent: Contributor Development and Growth in Open Source Evaluation Science"

  5. "开源教育的效果评估:开源评价学视角下的教育与贡献联动"
    "Evaluating the Effectiveness of Open Source Education: The Link between Education and Contribution in Open Source Evaluation Science"

  6. "开源活动的评价体系:从社区到技术活动的影响力分析"
    "Evaluation System for Open Source Activities: Influence Analysis from Community to Technical Activities"

Summary:

Each title combines Open Source Evaluatology with the core elements of the framework (projects, communities, ecosystem, innovation, contribution), aiming to explore, from an evaluatology perspective, how to better measure and assess open source projects and their related elements. These titles suit research at different levels, such as open source governance, contributor development, community health, and ecosystem building.

will-ww commented 1 month ago

To integrate Open Source Evaluatology with《评价学:评价科学与工程》(Evaluatology: Evaluation Science and Engineering) and innovate on it, the next step is to draw on the key theories in Evaluatology and adapt them to the characteristics of the open source domain, so as to truly build an interdisciplinary "Open Source Evaluatology". Below is the concrete plan, covering the title, the abstract, and a revised writing outline.

Optimized titles:

- Open Source Evaluatology: Construction and Practical Pathways of an Open Source Ecosystem Evaluation System from an Interdisciplinary Perspective
- Open Source Evaluatology: A Theoretical Framework and Practical Pathways for Building a Comprehensive Open Source Ecosystem Evaluation System

Optimized abstract:

As open source technology becomes increasingly important to global technological innovation and the digital economy, the complexity and diversity of the open source ecosystem pose new challenges for its evaluation. Existing open source evaluation systems are often confined to specific dimensions and cannot comprehensively and scientifically reflect the various entities in the open source ecosystem and their interrelationships.

This paper proposes "Open Source Evaluatology", an emerging interdisciplinary concept that combines the theoretical framework of evaluation science and engineering to construct a comprehensive evaluation system covering open source projects, technologies, communities, contributors, enterprises, and supply chains. Drawing on key methodologies from Evaluatology, the paper systematically analyzes and assesses the key components of the open source ecosystem from a systems perspective, a domain perspective, and a dynamic and evolutionary perspective. Based on scientific evaluation standards and data-driven models, it explores an engineering-oriented and scientific path for open source evaluation.

We further discuss the practical applications of this evaluation system in open source communities, enterprises, educational institutions, and policy-making, demonstrating the practical value of Open Source Evaluatology in promoting sustainable development of the open source ecosystem, technological innovation, and contributor incentives. Through interdisciplinary integration and innovation, this paper aims to provide comprehensive evaluation tools and methods for the open source ecosystem and to promote the coordinated development of global technology and economy.

Optimized writing outline:

1. Introduction

2. The Interdisciplinary Definition and Goals of Open Source Evaluatology

3. Limitations and Challenges of Open Source Ecosystem Evaluation Systems

4. An Open Source Evaluatology Framework Incorporating Evaluation Science

5. Constructing Evaluation Models and Methodologies

6. Practical Pathways: Application Scenarios of Open Source Evaluatology

7. Conclusion

8. Future Outlook


Summary:

By drawing on the theoretical framework of Evaluatology: Evaluation Science and Engineering, we deeply integrate evaluation science with the open source domain and establish "Open Source Evaluatology" as an interdisciplinary, innovative evaluation system. The framework covers not only the system composition, domain analysis, and dynamic evolution of the open source ecosystem, but also combines scientific evaluation methods with data-driven tools, offering new evaluation perspectives and practical pathways for open source project management, community governance, and technological innovation. This interdisciplinary integration lays a solid theoretical foundation and practical guide for the sustainable development of the open source ecosystem.

will-ww commented 1 month ago

Could we consider the following:

PureNatural commented 1 month ago
  • Open Source Evaluatology (开源评价学) @PureNatural

Submitted.

PureNatural commented 1 month ago

Title: Open Source Evaluatology: Theoretical Framework and Practical Pathways for Systematic Evaluation of the Open Source Ecosystem

Abstract: As the importance of open source technology continues to grow in global technological innovation and the digital economy, the increasing complexity and diversity of the open source ecosystem present new challenges for its evaluation. Existing evaluation systems are commonly limited to specific dimensions, failing to comprehensively and scientifically capture the various entities within the open source ecosystem and their interrelationships.

This paper introduces the interdisciplinary concept of "Open Source Evaluatology," which combines the theoretical frameworks of evaluation science and engineering. It proposes a comprehensive evaluation system that encompasses open source projects, technologies, communities, contributors, enterprises, and supply chains. Drawing on key methodologies from Evaluatology, this paper systematically analyzes and assesses the critical components of the open source ecosystem from a systems perspective, a domain perspective, and a dynamic and evolutionary perspective. Using scientifically established evaluation standards and data-driven models, we explore the pathways for engineering and scientific evaluation of the open source ecosystem.

Furthermore, we discuss the practical applications of this evaluation framework in open source communities, enterprises, educational institutions, and policy-making. We demonstrate the practical value of Open Source Evaluatology in promoting sustainable development of the open source ecosystem, fostering technological innovation, and incentivizing contributors. Through interdisciplinary innovation, this paper aims to provide comprehensive evaluation tools and methodologies for the open source ecosystem, contributing to the coordinated development of global science, technology, and economy.

bifenglin commented 1 month ago

Title: Evaluating the Performance of Complex Textual Tasks Generated by Large Language Models

Abstract: This study aims to evaluate complex text generation tasks produced by large language models, proposing a systematic evaluation framework that encompasses four key stages: model fine-tuning, text retrieval, prompt design, and text generation. First, the pre-trained model is fine-tuned using large-scale datasets to enhance its adaptability and performance in generating complex text for specific tasks. Second, text retrieval is conducted using multi-source heterogeneous databases to ensure the diversity and relevance of the generated content. During text generation, well-designed prompt templates are employed to guide the model toward producing the desired output, thus ensuring the quality of the generated text. The evaluation framework is divided into four dimensions: (1) an assessment of the model fine-tuning, focusing on its capacity to improve generation performance; (2) an evaluation of the text retrieval process, emphasizing the support provided by the database in terms of throughput, computational efficiency, contextual precision, recall, and relevance; (3) an evaluation of prompt design, assessing its effectiveness in guiding the generated content; and (4) an evaluation of the generated text itself, analyzing its accuracy, coherence, and complexity.

The evaluation methodology includes automated testing and evaluation, A/B testing, and human evaluation. In automated evaluation, three approaches are applied: (1) statistical scoring, which quantifies the quality of the generated text using statistical metrics; (2) model-based scoring, which employs other evaluation models to assess the quality of the generated content; and (3) hybrid scoring, which integrates both statistical and model-based scoring for a comprehensive assessment. A/B testing compares the performance of different models or prompts, while human evaluation relies on expert or user feedback to complement the automated assessments. Through this comprehensive evaluation framework, this paper provides a thorough analysis and assessment of large language models in complex text generation tasks.
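As a minimal sketch of the hybrid scoring idea above: a statistical score (a simple unigram F1, standing in for richer statistical metrics) blended with a model-based score. `model_score` below is a toy placeholder for a learned evaluator such as an LLM judge, and the weights are illustrative assumptions:

```python
from collections import Counter

def unigram_f1(generated: str, reference: str) -> float:
    """Statistical scoring: token-overlap F1 between generated and reference text."""
    gen, ref = Counter(generated.split()), Counter(reference.split())
    overlap = sum((gen & ref).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / sum(gen.values()), overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

def model_score(generated: str) -> float:
    """Model-based scoring: placeholder for a learned evaluator (e.g. an LLM judge)."""
    return min(1.0, len(set(generated.split())) / 50)  # toy proxy only

def hybrid_score(generated: str, reference: str,
                 w_stat: float = 0.5, w_model: float = 0.5) -> float:
    """Hybrid scoring: weighted blend of the statistical and model-based scores."""
    return w_stat * unigram_f1(generated, reference) + w_model * model_score(generated)

print(hybrid_score("the quick brown fox jumps", "a quick brown fox leaps"))
```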

zhingoll commented 1 month ago

Title:

Constructing Benchmarks for Open Source Ecosystems: A Stakeholder Needs-Driven Approach

Abstract

As open source technology increasingly influences the advancement of worldwide innovations and the digital marketplace, the heterogeneity and intricacy of its ecosystem have also escalated, posing new challenges for benchmark construction tailored to open source environments. Traditional benchmarking approaches are often confined to specific static environments, assessing the effectiveness of methodologies through measured metrics. However, in open source contexts characterized by high degrees of freedom, strong liquidity, and broad interaction, the community dynamically generates new real-world needs daily. Consequently, benchmarks developed in static environments fail to effectively measure the value of methodologies within open source scenarios.

In response, this paper advocates for a method of constructing open source benchmarks driven by the actual needs of stakeholders within the open source ecosystem, who frequently articulate their needs through the challenges they encounter. By identifying and understanding these needs, open source benchmarks can be designed to closely align with the actual demands of stakeholders, thereby enhancing their relevance and effectiveness. This approach not only facilitates the direct application of benchmark results in real-world scenarios for immediate feedback and verification but also enhances the authenticity and credibility of the evaluation outcomes.

We propose a process for constructing open source benchmarks: identifying the real needs of stakeholders within the open source ecosystem, translating these needs into specific tasks, creating benchmarks around these tasks, and applying the benchmark results to practical scenarios. This allows open source benchmarks to continuously adjust and improve based on real feedback, supporting the sustainable development of the open source ecosystem, and fosters a beneficial cycle of synergy between benchmarks and stakeholder needs, ensuring that benchmarks originate from and are tailored to real-world needs. A minimal sketch of this four-step loop follows below.
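The sketch models the loop as data structures (identify need, translate to task, benchmark the task, feed results back); all class and field names are hypothetical illustrations, not the paper's actual design:

```python
from dataclasses import dataclass, field

@dataclass
class StakeholderNeed:
    """A real need articulated by a stakeholder in the open source ecosystem."""
    stakeholder: str   # e.g. "maintainer", "enterprise user"
    challenge: str     # the challenge through which the need is articulated

@dataclass
class Benchmark:
    """A benchmark built around a task derived from a stakeholder need."""
    need: StakeholderNeed
    task: str
    feedback: list = field(default_factory=list)

    def record_feedback(self, outcome: float) -> None:
        """Store real-world feedback so the benchmark can be adjusted over time."""
        self.feedback.append(outcome)

need = StakeholderNeed("maintainer", "Which stale issues should be triaged first?")
bench = Benchmark(need, task="Rank open issues by predicted resolution value")
bench.record_feedback(0.72)  # e.g. agreement with maintainers' actual decisions
print(bench)
```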

birdflyi commented 1 month ago

Title

Open Source Informetrics: Theoretical Framework and Practical Pathways for the Open Source Ecosystem

Abstract

As open source technology increasingly influences global technological innovation and the digital economy, the complexity and diversity of the open source ecosystem present new challenges for scientific measurement. Current measurement methods for open source are often limited to specific dimensions and fail to provide a comprehensive and scientific evaluation of the diverse node types and collaborative relationships inherent in the ecosystem.

This article introduces the interdisciplinary concept of "open source informetrics", which merges informetrics theory with the unique characteristics of the open source domain to develop a comprehensive measurement system. This system encompasses various entities, including actors, event actions, open source repositories, and code commits. The theoretical framework section explores the indicator types, analysis models, and evaluation systems associated with open source ecosystems and emphasizes the practical significance of these indicators in evaluating project management and community governance. In the empirical research section, this study analyzes the behavioral characteristics of participants in open source projects through systematic data collection and processing, evaluates performance indicators across different projects, and reveals their influence on project success. Furthermore, the article discusses the practical applications of the evaluation system within the open source community, illustrating the value of open source informetrics in promoting sustainable development of the open source ecosystem, technological innovation, and motivating contributors.

In conclusion, this article synthesizes the theoretical framework and practical pathways of open source informetrics, emphasizes the importance of interdisciplinary integration in promoting the sustainable development of open source ecosystems, and identifies future research directions and challenges. By deeply integrating informetrics with the field of open source, this article offers a new perspective and practical guidance for the management and technological innovation of open source projects.
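A toy sketch of the kind of indicators such a measurement system could compute over actors, event actions, and repositories; the event tuples below are hypothetical stand-ins for real event-log data:

```python
from collections import Counter, defaultdict

# Hypothetical event log: (actor, event_action, repository).
events = [
    ("alice", "PushEvent", "org/repo-a"),
    ("alice", "IssuesEvent", "org/repo-a"),
    ("bob", "PullRequestEvent", "org/repo-a"),
    ("bob", "PushEvent", "org/repo-b"),
]

# Indicator 1: per-actor activity volume.
activity = Counter(actor for actor, _, _ in events)

# Indicator 2: per-repository contributor diversity (distinct actors).
contributors = defaultdict(set)
for actor, _, repo in events:
    contributors[repo].add(actor)
diversity = {repo: len(actors) for repo, actors in contributors.items()}

print(activity)   # Counter({'alice': 2, 'bob': 2})
print(diversity)  # {'org/repo-a': 2, 'org/repo-b': 1}
```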

Peng99999 commented 1 month ago

Title: Evaluating Long-Term Usage Patterns of Open Source Datasets: A Citation Network Approach

Abstract: The evaluation of datasets serves as a fundamental basis for tasks in evaluatology, and evaluating the usage patterns of datasets has a significant impact on the selection of appropriate datasets. Many renowned Open Source datasets are well-established and have not been updated for many years, yet they continue to be widely used by a large number of researchers. Because of this, conventional Open Source metrics based on log activity data from their GitHub repositories (e.g., number of stars, issues, and activity) are insufficient for evaluating long-term usage patterns.

Researchers often encounter significant challenges in selecting appropriate datasets due to the lack of insight into how these datasets are being utilized. To address this challenge, this paper proposes establishing a connection between Open Source datasets and the citation networks of their corresponding academic papers. By constructing the citation network of the corresponding academic paper, we can obtain rich graph-structured information, such as citation counts, authors, and more. Utilizing this information, we can evaluate the long-term usage patterns of the associated Open Source dataset.

Furthermore, this paper conducts extensive experiments based on ten major dataset categories (Texts, Images, Videos, Audio, Medical, 3D, Graphs, Time Series, Tabular, and Speech) to demonstrate that the proposed method effectively evaluates the long-term usage patterns of Open Source datasets. Additionally, the insights gained from the experimental results can serve as a valuable reference for future researchers in selecting appropriate datasets for their work.
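A minimal sketch of the citation-network idea, assuming paper metadata with publication years is available; it uses networkx, and all paper IDs and years are hypothetical:

```python
from collections import Counter
import networkx as nx

# Directed citation graph: an edge (p, q) means paper p cites paper q.
G = nx.DiGraph()
G.add_node("dataset_paper", year=2009)  # the paper that introduced the dataset
for paper, year in [("p1", 2015), ("p2", 2019), ("p3", 2019), ("p4", 2023)]:
    G.add_node(paper, year=year)
    G.add_edge(paper, "dataset_paper")

# Long-term usage signal: citations to the dataset paper, bucketed by year.
per_year = Counter(G.nodes[p]["year"] for p in G.predecessors("dataset_paper"))
print(sorted(per_year.items()))  # [(2015, 1), (2019, 2), (2023, 1)]
```

A per-year citation series like this captures continued use of an unmaintained dataset in a way repository stars and issue counts cannot.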

huangfan0 commented 3 weeks ago

Title: A Benchmark Dataset and Evaluation of Collaboration Networks in the Open Source Software Community

Abstract: The collaboration network within the open source software community plays a crucial role in motivating developers and ensuring the community's sustainable development. However, current research on open source collaboration networks focuses primarily on statistical indicators; standardized benchmarks and objective impact evaluations are largely absent, which hinders a comprehensive understanding of collaboration networks. To address this, we first provide a public dataset of real-world open source collaboration graphs that captures their structural and intrinsic characteristics. In addition, we establish a benchmark with link prediction and recommendation tasks built on graph neural networks. Finally, we evaluate the influence of developers and repositories. The results demonstrate the effectiveness of our evaluation method and reveal potential development trends in the open source software community. We anticipate that the proposed benchmark and evaluation will serve as a platform for testing and comparing future work on open source collaboration networks.
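A minimal sketch of a link prediction baseline on a developer collaboration graph; networkx's Jaccard-coefficient heuristic stands in here for the graph neural networks used in the paper, and the toy edges are hypothetical:

```python
import networkx as nx

# Toy collaboration graph: an edge means two developers co-contributed
# to the same repository.
G = nx.Graph()
G.add_edges_from([
    ("alice", "bob"), ("alice", "carol"), ("alice", "erin"),
    ("bob", "dave"), ("carol", "dave"),
])

# Score candidate (currently absent) links by neighborhood Jaccard similarity;
# higher scores suggest likelier future collaboration.
candidates = [("alice", "dave"), ("bob", "carol")]
for u, v, score in nx.jaccard_coefficient(G, candidates):
    print(f"{u} -- {v}: {score:.2f}")
```

Heuristics like this make a useful sanity-check baseline when comparing learned link predictors on the same benchmark split.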

will-ww commented 2 weeks ago

(image attachment)