Hi @brainhacklucca my project is ready!
We're crafting a Niche Index for academic exploration/exploitation – because citation counts are so last century. Join us to untangle the web of research behaviors with NLP and network analysis. It's like ecology for scientists ;). #Brainhack #NicheIndexChallenge
Hi @tramaglino, your project has been successfully added to the BHL 2023 website! :tada: See you soon! Ruggero
Title
Exploring and Exploiting Behavior in Research Career Trajectories: Developing a Niche Index for Single Researchers
Leaders
Lorenzo Teresi (https://twitter.com/teresi_lorenzo), Roberto Pizziol
Collaborators
No response
Brainhack Global 2023 Event
Brainhack Lucca
Project Description
Title: "Development of a Niche Index for Quantifying Exploitation and Exploration Behaviors in Research Scientists Through Abstract Similarity and Co-citation Patterns"
Introduction: The landscape of scientific research is shaped by the dual forces of exploitation—deepening knowledge within established domains—and exploration—venturing into novel areas. Traditional metrics such as citation counts fail to capture the nuanced interplay of these behaviors. This proposal introduces a novel Niche Index, designed to quantify researchers' tendencies towards exploitation or exploration, thereby offering a multidimensional view of scientific contributions.
Objectives:
Methodology: The study will employ a mixed-methods approach, integrating quantitative text analysis and network analysis.
Textual Analysis:
Network Analysis:
Significance:
This research is poised to:
Preliminary Steps:
Key Resources:
Conclusion: The Niche Index promises to improve the assessment of scientific research by capturing the dynamic interplay between depth and breadth in scholarly work. This metric will not only reflect the intellectual trajectory of researchers but also guide institutions in nurturing a balanced research ecosystem.
Link to project repository/sources
No response
Goals for Brainhack Global
Day 1: Foundation and Framework Development
Milestone 1: Theoretical Alignment
Deliverable: A comprehensive literature review summary to align the Niche Index with current scientometric theories.
Goal: Participants with a background in scientometrics and literature review will aim to synthesize existing knowledge into a cohesive framework for the Niche Index.
Milestone 2: Data Collection Setup
Deliverable: A protocol for collecting and preprocessing abstracts and co-citation data.
Goal: Participants skilled in data mining will establish the necessary pipelines for data acquisition from scientific databases.
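As a starting point for the acquisition pipeline, here is a minimal sketch of how PubMed queries could be assembled with the NCBI E-utilities endpoints. It only builds URLs and sends no requests. The helper names `build_esearch_url` and `build_efetch_url`, the author-based search term, and the `retmax` default are assumptions for illustration, not a protocol the project has agreed on.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(author, retmax=200):
    """Build an ESearch URL that lists PubMed IDs for one author.

    Hypothetical helper: searching by author name alone is ambiguous,
    so a real protocol would need author disambiguation on top of this.
    """
    params = {
        "db": "pubmed",
        "term": f"{author}[Author]",
        "retmax": retmax,
        "retmode": "json",
    }
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def build_efetch_url(pmids):
    """Build an EFetch URL that returns plain-text abstracts for the given IDs."""
    params = {
        "db": "pubmed",
        "id": ",".join(pmids),
        "rettype": "abstract",
        "retmode": "text",
    }
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    print(build_esearch_url("Teresi L"))
    print(build_efetch_url(["12345678", "23456789"]))  # placeholder IDs
```

The two URLs can then be fetched with any HTTP client, respecting the service's rate limits. Scopus would need its own, subscription-based access path.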
Milestone 3: Tool Selection
Deliverable: A list of NLP and network analysis tools.
Goal: Tech-savvy participants will select and prepare the tools needed for text and network analysis, ensuring they are ready for use.
Day 2: Analysis and Development
Milestone 4: Preliminary Text Analysis
Deliverable: Initial semantic embeddings of selected abstracts.
Goal: NLP enthusiasts will apply text analysis models to generate abstract embeddings, suitable for participants with machine learning and programming skills.
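To give an intuition for the abstract-similarity step, the sketch below scores how similar a researcher's abstracts are to each other. It is a stand-in only: it uses TF-IDF bag-of-words vectors with cosine similarity instead of the semantic embeddings named in the deliverable, so that it runs with the Python standard library alone. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(abstracts):
    """Return one sparse TF-IDF vector (dict term -> weight) per abstract."""
    docs = [Counter(tokenize(a)) for a in abstracts]
    n = len(docs)
    # Document frequency: number of abstracts that contain each term.
    df = Counter(term for doc in docs for term in doc)
    return [
        {t: tf * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, tf in doc.items()}
        for doc in docs
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_pairwise_similarity(abstracts):
    """Average cosine similarity over all pairs of abstracts."""
    vectors = tfidf_vectors(abstracts)
    pairs = [(i, j) for i in range(len(vectors)) for j in range(i + 1, len(vectors))]
    if not pairs:
        return 0.0
    return sum(cosine(vectors[i], vectors[j]) for i, j in pairs) / len(pairs)


if __name__ == "__main__":
    focused = ["visual cortex fmri decoding", "fmri decoding of visual cortex activity"]
    broad = ["visual cortex fmri decoding", "protein folding molecular dynamics"]
    print(mean_pairwise_similarity(focused), mean_pairwise_similarity(broad))
```

A high average suggests a researcher who keeps writing about the same topics (exploitation), a low one suggests topical spread (exploration). Swapping `tfidf_vectors` for a real embedding model would leave the rest of the sketch unchanged.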
Milestone 5: Network Analysis Algorithm Development
Deliverable: A draft algorithm for co-citation network analysis and clustering coefficient calculation.
Goal: Participants with experience in network analysis and algorithm development will craft the initial version of the clustering algorithm.
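One possible shape for the draft algorithm is sketched below, assuming the input is a list of reference lists (one per citing paper). Two papers are linked when they are cited together at least `min_weight` times, and the local clustering coefficient is the fraction of a node's neighbour pairs that are themselves linked. The function names and the `min_weight` threshold are assumptions for illustration.

```python
from collections import defaultdict
from itertools import combinations


def cocitation_graph(reference_lists, min_weight=1):
    """Build an undirected co-citation graph as a dict node -> set of neighbours."""
    weights = defaultdict(int)
    for refs in reference_lists:
        # Every pair of papers cited together in one reference list is co-cited once.
        for a, b in combinations(sorted(set(refs)), 2):
            weights[(a, b)] += 1
    graph = defaultdict(set)
    for (a, b), weight in weights.items():
        if weight >= min_weight:
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def local_clustering(graph, node):
    """Fraction of pairs of neighbours of `node` that are linked to each other."""
    neighbours = graph.get(node, set())
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbours, 2) if b in graph[a])
    return 2.0 * links / (k * (k - 1))


def average_clustering(graph):
    """Mean local clustering coefficient over all nodes in the graph."""
    if not graph:
        return 0.0
    return sum(local_clustering(graph, n) for n in graph) / len(graph)


if __name__ == "__main__":
    # Hypothetical toy data: each inner list is the reference list of one citing paper.
    reference_lists = [["A", "B", "C"], ["C", "D"]]
    graph = cocitation_graph(reference_lists)
    print({node: local_clustering(graph, node) for node in sorted(graph)})
    print(average_clustering(graph))
```

For real data, a graph library would scale better than this dictionary-based version. The sketch is meant to pin down the definitions before tools are chosen.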
Milestone 6: Integration Testing
Deliverable: A report on the integration of textual and network analysis methods.
Goal: Developers and system integrators will ensure that the different components of the analysis work in concert.
Day 3: Refinement and Presentation
Milestone 7: Data Analysis and Index Computation
Deliverable: A computation of the Niche Index for a test set of researchers.
Goal: Data scientists and statisticians will calculate the Niche Index, adjusting parameters and refining the model based on preliminary results.
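The exact formula for the Niche Index is still to be defined during the event. Purely as a placeholder to make the parameter tuning concrete, the sketch below combines a textual score and a network score into one number with a weight `alpha`. The weighted-average form, the names `niche_index` and `label_behavior`, and the 0.5 threshold are all assumptions, not the project's method.

```python
def niche_index(mean_abstract_similarity, mean_clustering, alpha=0.5):
    """Hypothetical Niche Index: weighted average of two scores in [0, 1].

    Higher values are read as exploitation (focused, tightly clustered work),
    lower values as exploration. `alpha` weights the textual component.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    for score in (mean_abstract_similarity, mean_clustering):
        if not 0.0 <= score <= 1.0:
            raise ValueError("component scores must lie in [0, 1]")
    return alpha * mean_abstract_similarity + (1.0 - alpha) * mean_clustering


def label_behavior(index, threshold=0.5):
    """Map an index value to a coarse label; the threshold is an arbitrary assumption."""
    return "exploitation" if index >= threshold else "exploration"


if __name__ == "__main__":
    # Hypothetical component scores for two researchers.
    for name, similarity, clustering in [("R1", 0.8, 0.6), ("R2", 0.2, 0.1)]:
        index = niche_index(similarity, clustering)
        print(name, round(index, 3), label_behavior(index))
```

Adjusting `alpha` on the test set is one simple way to explore how much the textual and network components should each contribute.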
Milestone 8: Comparative Metric Analysis
Deliverable: A comparison report of the Niche Index against traditional metrics.
Goal: Participants with a strong understanding of scientometrics will compare the newly calculated index with existing metrics.
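One simple way to frame the comparison is a rank correlation between the index and a traditional metric such as citation counts. The sketch below implements Spearman's rank correlation with average ranks for ties, using only the standard library. The choice of Spearman and the example numbers are assumptions for illustration.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0.0 or sd_y == 0.0:
        return 0.0
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical values for three researchers.
    niche_values = [0.1, 0.4, 0.9]
    citation_counts = [5, 20, 300]
    print(spearman(niche_values, citation_counts))
```

A correlation near zero would suggest that the index captures something citation counts do not, which is the claim the comparison report is meant to examine.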
Milestone 9: Documentation and Dissemination
Deliverable: Complete documentation and a draft presentation of the project outcomes, including figures that illustrate the results and give an intuition for how the index works.
Goal: All participants will contribute to the final documentation, with those possessing strong writing and presentation skills leading the effort.
Final Presentation: Showcasing the Niche Index
Deliverable: A comprehensive presentation of the Niche Index development process and its potential impact.
Goal: To engage the entire group in presenting the work accomplished, highlighting the collaborative effort and potential future work.
Good first issues
Task: Read the main background papers needed to situate and refine the development of the Niche Index: Börner et al., 2004a; Börner et al., 2004b; Leydesdorff et al., 2017; Skov 2021; Kawamura et al., 2018; Kosten 2016; Jiang et al., 2023; Park et al., 2023; Lascialfari et al., 2022; Leydesdorff et al., 2021; Packalen et al., 2017; Leydesdorff et al., 2017; Penner et al., 2013; Li et al., 2019.
Task: Install required software and tools; ensure the development environment is working. Skill Development: Familiarization with project-specific tools and setup processes.
Task: Compile a list of useful resources, tutorials, and guides related to the project's technological stack.
Communication channels
https://app.slack.com/client/T064NF0E873/C0649NTDWSX
Skills
Onboarding documentation
No response
What will participants learn?
Participants will gain hands-on experience with NLP and network analysis, learn data curation techniques, enhance their understanding of scientometrics, and develop collaboration skills in an open-source environment. Our project is structured to ensure that newcomers are mentored.
Data to use
Our project uses scientific abstracts and citation data from databases such as PubMed and Scopus, or from pre-downloaded datasets that we have yet to identify. For detailed analysis, we'll use semantic embeddings and co-citation networks derived from this data. Access to the databases may require institutional affiliation or subscription.
Number of collaborators
1-3
Credit to collaborators
New contributors will be acknowledged through a dedicated section in the project documentation and via public acknowledgments on our project's repository and community channels.
Image
Type
data_management, method_development
Development status
0_concept_no_content
Topic
diversity_inclusivity_equality, neural_networks, other
Tools
Jupyter, other
Programming language
documentation, Python, shell_scripting, html_css
Modalities
not_applicable
Git skills
0_no_git_skills, 1_commit_push, 2_branches_PRs
Anything else?
No response
Things to do after the project is submitted and ready to review.
Hi @brainhacklucca my project is ready!