BioinfoNet / Data-mining

Data mining to discover trends in Open Science in Kenya

Pre-print data from prepub #2

Open kipkurui opened 6 years ago

kipkurui commented 6 years ago

Data from various pre-prints have been collated by prepub. We could use these data for our analysis to answer the question: who is driving the adoption of pre-prints in Kenya? Is it Kenyan authors or outside collaborators?

You can download the data from the above using:

wget https://raw.githubusercontent.com/OmnesRes/prepub/master/biorxiv/biorxiv_licenses.tsv
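A GitHub `blob` URL returns the HTML page that displays a file, not the file itself, so it is worth checking what was actually downloaded. A minimal sketch of such a check follows; the helper name `looks_like_tsv`, the five-line sample size, and the demo column names are assumptions for illustration, not part of prepub.

```python
import os
import tempfile


def looks_like_tsv(path, sample_lines=5):
    """Heuristic check: the first lines contain tabs and no HTML markers."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        lines = [handle.readline() for _ in range(sample_lines)]
    head = "".join(lines).lower()
    if "<!doctype html" in head or "<html" in head:
        return False
    non_empty = [line for line in lines if line.strip()]
    return bool(non_empty) and all("\t" in line for line in non_empty)


# Demo on two throwaway files with made-up content (not the real data).
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.tsv")
    bad = os.path.join(tmp, "bad.tsv")
    with open(good, "w", encoding="utf-8") as handle:
        handle.write("doi\tlicense\n10.1101/000001\tCC-BY\n")
    with open(bad, "w", encoding="utf-8") as handle:
        handle.write("<!DOCTYPE html>\n<html><body>page</body></html>\n")
    good_ok = looks_like_tsv(good)
    bad_ok = looks_like_tsv(bad)

print(good_ok, bad_ok)
```

On the real download, the call would be `looks_like_tsv("biorxiv_licenses.tsv")`.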

These guys used the same data to understand the licensing choices of authors on bioRxiv.
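One possible way to frame the question in code, as a rough sketch: label each pre-print by whether its first author lists a Kenyan affiliation. The record layout (one affiliation string per author, in author order), the function names, the label names, and the first-author rule are all assumptions for illustration; the prepub data may be organised differently.

```python
from collections import Counter


def mentions_kenya(affiliation):
    """Return True if the affiliation string contains 'kenya' (case-insensitive)."""
    return "kenya" in affiliation.lower()


def classify_preprint(affiliations):
    """Label a pre-print from its authors' affiliations, given in author order.

    Assumed rule: the first author is treated as the one driving the pre-print.
    """
    if not affiliations:
        return "unknown"
    if mentions_kenya(affiliations[0]):
        return "kenyan_led"
    if any(mentions_kenya(a) for a in affiliations[1:]):
        return "collaborator_led"
    return "no_kenyan_affiliation"


# Made-up sample records, not real data.
sample = [
    ["University of Nairobi, Kenya", "University of Oxford, UK"],
    ["University of Oxford, UK", "KEMRI, Kilifi, Kenya"],
    ["Harvard University, USA"],
    [],
]

counts = Counter(classify_preprint(a) for a in sample)
print(counts)
```

A first-author rule is only one choice; corresponding author or share of Kenyan-affiliated authors would be equally defensible and may give different answers.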

kipkurui commented 6 years ago

Hi @silviane-m, did you make progress on this? I'll find some time to help out on it this week.

Silviane-m commented 6 years ago

Dear Caleb, I hope you have been doing well.

Sorry I took forever; I have been stuck on how to process the information that I have.

The question I have been addressing is "Who drives the adoption of pre-prints: is it author-based or affiliation-based?" I have explored bioRxiv and Sabinet journals. In these sources there are some foreign authors affiliated with Kenyan institutions, and even foreign authors writing articles about Kenyan issues, but the majority of the articles are written by Kenyan authors.

Now the challenge has been what to do with the articles: whether to retrieve them at all, and I am stuck on what to do with the ones I have already retrieved.

Cheers!

kipkurui commented 6 years ago

Hi @Silviane-m, that is alright. Let me have a look at what we have and work on a roadmap for addressing this question. Let's keep learning 👍

kipkurui commented 6 years ago

I made some progress on this, @Silviane-m. See the data visualization notebook.

Silviane-m commented 6 years ago

Dear Caleb,

I hope you are keeping well.

This is well noted and thanks for giving me a direction.

Cheers!

kipkurui commented 4 years ago

We can use the API or the data dump from: https://www.rxivist.org/docs

wanjauk commented 4 years ago

I'm currently attempting to use the data dump that's pre-loaded in Docker and the instructions here to see whether we can obtain up-to-date data (which includes 2019 & 2020) about pre-prints.

We may also have an issue with using the bioRxiv licenses data to obtain the number of bioRxiv pre-prints per country. In the affiliations column, some of the affiliations are missing the country name. Because we assign pre-prints to countries based on the country name in the affiliation, we may fail to assign the majority of the pre-prints to any country.
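To gauge how big the gap is, a minimal sketch could try a plain country-name match first, fall back to an institution lookup, and count what is still unassigned. The country list, the `INSTITUTION_HINTS` table, the function name `assign_country`, and the sample affiliations are all hypothetical; the real affiliations column would need a much larger lookup.

```python
# Hypothetical, deliberately tiny list of country names to search for.
COUNTRIES = ["Kenya", "Uganda", "United Kingdom", "United States"]

# Hypothetical fallback: institution keyword (lowercase) -> country.
INSTITUTION_HINTS = {
    "kemri": "Kenya",
    "university of nairobi": "Kenya",
    "makerere": "Uganda",
}


def assign_country(affiliation):
    """Return a country for the affiliation, or None if nothing matches."""
    text = affiliation.lower()
    # First pass: the country name appears in the affiliation itself.
    for country in COUNTRIES:
        if country.lower() in text:
            return country
    # Second pass: infer the country from a known institution keyword.
    for keyword, country in INSTITUTION_HINTS.items():
        if keyword in text:
            return country
    return None


# Made-up affiliations, not taken from the real data.
sample = [
    "Department of Biochemistry, University of Nairobi, Kenya",
    "KEMRI-Wellcome Trust Research Programme, Kilifi",
    "Makerere University, Kampala",
    "Institute of Genomics",
]

assigned = [assign_country(a) for a in sample]
unassigned = sum(1 for country in assigned if country is None)
print(assigned, unassigned)
```

Reporting the unassigned count alongside the per-country totals would make it clear how much of the data the country-name approach leaves out.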