
Exploring and Exploiting Behavior in Research Career Trajectories #3

Open tramaglino opened 8 months ago

tramaglino commented 8 months ago


Exploring and Exploiting Behavior in Research Career Trajectories: Developing a Niche Index for Single Researchers


Lorenzo Teresi Roberto Pizziol


No response

Brainhack Global 2023 Event

Brainhack Lucca

Project Description

Title: "Development of a Niche Index for Quantifying Exploitation and Exploration Behaviors in Research Scientists Through Abstract Similarity and Co-citation Patterns"

Introduction: The landscape of scientific research is shaped by the dual forces of exploitation—deepening knowledge within established domains—and exploration—venturing into novel areas. Traditional metrics such as citation counts fail to capture the nuanced interplay of these behaviors. This proposal introduces a novel Niche Index, designed to quantify researchers' tendencies towards exploitation or exploration, thereby offering a multidimensional view of scientific contributions.


  1. To develop a Niche Index that quantifies the exploitation and exploration behaviors in research scientists.
  2. To compare the proposed metrics with one another and with traditional research indicators (e.g., h-index).
  3. To create a framework that allows academic institutions to understand the research dynamics of their faculty better, informing resource allocation and strategic planning.

Methodology: The study will employ a mixed-methods approach, integrating quantitative text analysis and network analysis.

Textual Analysis:

Network Analysis:


This research is poised to:

Preliminary Steps:

Key Resources:

Conclusion: The Niche Index promises to improve the assessment of scientific research by capturing the dynamic interplay between depth and breadth in scholarly work. This metric will not only reflect the intellectual trajectory of researchers but also guide institutions in nurturing a balanced research ecosystem.

Link to project repository/sources

No response

Goals for Brainhack Global

Day 1: Foundation and Framework Development

Milestone 1: Theoretical Alignment

Deliverable: A comprehensive literature review summary to align the Niche Index with current scientometric theories.

Goal: Participants with a background in scientometrics and literature review will aim to synthesize existing knowledge into a cohesive framework for the Niche Index.

Milestone 2: Data Collection Setup

Deliverable: A protocol for collecting and preprocessing abstracts and co-citation data.

Goal: Participants skilled in data mining will establish the necessary pipelines for data acquisition from scientific databases.

Milestone 3: Tool Selection

Deliverable: A list of NLP and network analysis tools.

Goal: Tech-savvy participants will select and prepare the tools needed for text and network analysis, ensuring they are ready for use.

Day 2: Analysis and Development

Milestone 4: Preliminary Text Analysis

Deliverable: Initial semantic embeddings of selected abstracts.

Goal: NLP enthusiasts will apply text analysis models to generate abstract embeddings, suitable for participants with machine learning and programming skills.
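
A minimal sketch of the abstract-similarity step, assuming a toy bag-of-words vector in place of real semantic embeddings (the proposal does not name a model, so `embed` is a hypothetical stand-in that a sentence-embedding model would replace). It shows how the mean pairwise cosine similarity between a researcher's abstracts could be computed once each abstract is a vector.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a semantic embedding: lowercase word counts (assumption)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_pairwise_similarity(abstracts):
    """Average cosine similarity over all pairs of abstracts (0.0 if fewer than two)."""
    vectors = [embed(a) for a in abstracts]
    pairs = [(i, j) for i in range(len(vectors)) for j in range(i + 1, len(vectors))]
    if not pairs:
        return 0.0
    return sum(cosine(vectors[i], vectors[j]) for i, j in pairs) / len(pairs)


# Made-up abstracts, for illustration only.
focused = ["neural networks in visual cortex", "visual cortex neural networks and attention"]
broad = ["neural networks in visual cortex", "citation patterns among economists"]
print(mean_pairwise_similarity(focused) > mean_pairwise_similarity(broad))
```

Under this toy setup, a higher mean similarity would point towards exploitation (staying close to earlier work) and a lower one towards exploration; whether that reading holds with real embeddings is something the milestone would need to check.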

Milestone 5: Network Analysis Algorithm Development

Deliverable: A draft algorithm for co-citation network analysis and clustering coefficient calculation.

Goal: Participants with experience in network analysis and algorithm development will craft the initial version of the clustering algorithm.
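
As a starting point for this milestone, here is a hedged sketch in plain Python. It assumes the input is one reference list per citing paper, links two cited works whenever some paper cites both (the usual definition of co-citation), and computes the standard local clustering coefficient. The function names and the toy data are illustrative, not part of the project.

```python
from itertools import combinations


def cocitation_graph(reference_lists):
    """Undirected graph where cited works are linked if at least one paper cites both."""
    graph = {}
    for refs in reference_lists:
        unique = sorted(set(refs))
        for ref in unique:
            graph.setdefault(ref, set())
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def local_clustering(graph, node):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbours, 2) if b in graph[a])
    return 2 * links / (k * (k - 1))


def average_clustering(graph):
    """Mean local clustering coefficient over all nodes (0.0 for an empty graph)."""
    if not graph:
        return 0.0
    return sum(local_clustering(graph, n) for n in graph) / len(graph)


# Toy data: two citing papers and the works they reference (hypothetical IDs).
g = cocitation_graph([["A", "B", "C"], ["C", "D"]])
print({node: round(local_clustering(g, node), 3) for node in sorted(g)})
```

A library such as NetworkX offers equivalent routines; the hand-rolled version is only meant to make the definition explicit before the team settles on tooling.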

Milestone 6: Integration Testing

Deliverable: A report on the integration of textual and network analysis methods.

Goal: Developers and system integrators will ensure that the different components of the analysis work in concert.

Day 3: Refinement and Presentation

Milestone 7: Data Analysis and Index Computation

Deliverable: A computation of the Niche Index for a test set of researchers.

Goal: Data scientists and statisticians will calculate the Niche Index, adjusting parameters and refining the model based on preliminary results.
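
The proposal does not yet fix a formula for the index, so the following is purely a placeholder sketch. It assumes, hypothetically, that the index is a weighted average of a text-similarity score and a co-citation clustering score, both already scaled to [0, 1]; the `weight` parameter is one example of the kind of parameter that could be adjusted during this milestone.

```python
def niche_index(text_similarity, clustering, weight=0.5):
    """Hypothetical Niche Index: weighted mean of two scores in [0, 1].

    This combination rule is an assumption for illustration, not the
    project's definition. Higher values would suggest exploitation.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * text_similarity + (1 - weight) * clustering


# Made-up scores for two hypothetical researchers.
scores = {"researcher_1": (0.8, 0.4), "researcher_2": (0.2, 0.1)}
for name, (sim, clust) in scores.items():
    print(name, round(niche_index(sim, clust), 3))
```

Other combination rules (a product, a rank-based score, or keeping the two components separate) are equally plausible at this stage.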

Milestone 8: Comparative Metric Analysis

Deliverable: A comparison report of the Niche Index against traditional metrics.

Goal: Participants with a strong understanding of scientometrics will compare the newly calculated index with existing metrics.
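
One way the comparison could be sketched: compute the h-index from per-paper citation counts, then measure rank agreement with the new index using Spearman's correlation. The h-index definition is standard; the choice of Spearman and the sample numbers below are assumptions made for illustration.

```python
import math


def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    h = 0
    for rank, count in enumerate(sorted(citations, reverse=True), start=1):
        if count < rank:
            break
        h = rank
    return h


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy) if sx and sy else 0.0


# Made-up citation counts and index values for three hypothetical researchers.
h_values = [h_index(c) for c in ([10, 8, 5, 4, 3], [3, 1, 0], [25, 20, 2])]
index_values = [0.7, 0.2, 0.5]
print(h_values, round(spearman(h_values, index_values), 3))
```

A low correlation would suggest the index captures something the h-index does not, which is the claim the comparison report is meant to test.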

Milestone 9: Documentation and Dissemination

Deliverable: Complete documentation and a draft presentation of the project outcomes, including figures that illustrate the results and give an intuition for how the index works.

Goal: All participants will contribute to the final documentation, with those possessing strong writing and presentation skills leading the effort.

Final Presentation: Showcasing the Niche Index

Deliverable: A comprehensive presentation of the Niche Index development process and its potential impact.

Goal: To engage the entire group in presenting the work accomplished, highlighting the collaborative effort and potential future work.

Good first issues

  1. Task: Read the main background papers needed to situate and refine the development of the Niche Index: Börner et al., 2004a; Börner et al., 2004b; Leydesdorff et al., 2017; Skov 2021; Kawamura et al., 2018; Kosten 2016; Jiang et al., 2023; Park et al., 2023; Lascialfari et al., 2022; Leydesdorff et al., 2021; Packalen et al., 2017; Leydesdorff et al., 2017; Penner et al., 2013; Li et al., 2019.

  2. Task: Install required software and tools; ensure the development environment is working. Skill Development: Familiarization with project-specific tools and setup processes.

  3. Task: Compile a list of useful resources, tutorials, and guides related to the project's technological stack.

Communication channels


Onboarding documentation

No response

What will participants learn?

Participants will gain hands-on experience with NLP and network analysis, learn data curation techniques, enhance their understanding of scientometrics, and develop collaboration skills in an open-source environment. Our project is structured to ensure that newcomers are mentored

Data to use

Our project uses scientific abstracts and citation data sourced from databases such as PubMed and Scopus, or from already-downloaded datasets that are still to be identified. For detailed analysis, we'll use semantic embeddings and co-citation networks derived from this data. Access to the databases may require institutional affiliation or a subscription.

Number of collaborators


Credit to collaborators

New contributors will be acknowledged through a dedicated section in the project documentation and via public acknowledgments on our project's repository and community channels.




data_management, method_development

Development status



diversity_inclusivity_equality, neural_networks, other


Jupyter, other

Programming language

documentation, Python, shell_scripting, html_css



Git skills

0_no_git_skills, 1_commit_push, 2_branches_PRs

Anything else?

No response

Things to do after the project is submitted and ready to review.

tramaglino commented 8 months ago

Hi @brainhacklucca my project is ready!

tramaglino commented 8 months ago

We're crafting a Niche Index for academic exploration/exploitation – because citation counts are so last century. Join us to untangle the web of research behaviors with NLP and network analysis. It's like ecology for scientists ;). #Brainhack #NicheIndexChallenge

StanSStanman commented 8 months ago

Hi @tramaglino, your project has been successfully added to the BHL 2023 website! :tada: See you soon! Ruggero