krassowski / multi-omics-state-of-the-field

Analyses for "State of the field in multi-omics research: from computational needs to data mining and sharing"
https://doi.org/10.3389/fgene.2020.610798
MIT License

Normalize the timeline plot against the number of articles published in the journals we consider #12

Closed by krassowski 4 years ago

krassowski commented 4 years ago

To avoid the impression that the field is growing faster than it is ;)

vd4mmind commented 4 years ago

Hi Mike, if I understand correctly from sifting through the notebooks, the timeline is 2005-2020. Is that correct? In that case, the plot seems fine. Can you point me to the notebook where you have the latest plot? I want to see whether the normalization is by n=15 or by the median distribution of hits per journal per year.

krassowski commented 4 years ago

Yes, we have papers from 2005-2020. It is not normalized yet - I opened the issue to address this later on (maybe tonight).

krassowski commented 4 years ago

The normalization does not change much. I only retrieved the total number of indexed articles in PubMed for journals in which at least 3 multi-omics articles were published (covering 75% of the articles), but it still took 2 hours to download.
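The journal selection described above could be sketched roughly as follows. This is a minimal illustration, not the notebook code from the repository: the function name `select_journals`, the `min_articles` parameter, and the toy journal labels are all hypothetical.

```python
from collections import Counter


def select_journals(article_journals, min_articles=3):
    """Pick journals with at least `min_articles` matched articles.

    `article_journals` holds one journal name per matched article.
    Returns the selected journals and the share of articles they cover.
    """
    per_journal = Counter(article_journals)
    selected = {name for name, n in per_journal.items() if n >= min_articles}
    covered = sum(n for name, n in per_journal.items() if name in selected)
    coverage = covered / len(article_journals) if article_journals else 0.0
    return selected, coverage


# Toy data (made up): 11 matched articles spread over four journals.
journals = ["A"] * 5 + ["B"] * 3 + ["C"] * 2 + ["D"]
selected, coverage = select_journals(journals)
print(sorted(selected), round(coverage, 2))
```

Only the selected journals then need their yearly totals retrieved, which is what keeps the download manageable at the cost of an incomplete denominator.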

Count-based: image

Adjusted counts ("pseudo-frequency"; pseudo because the denominator, for practical reasons, does not encompass all journals): image

The uptick in 2020 is partially because of the one-week delay (I have not re-run the search, given it's not a priority now).
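As a rough sketch of the adjusted counts, assuming the "pseudo-frequency" is simply the yearly number of matched articles divided by the yearly number of indexed articles in the selected journals. The function name, the `per` scaling factor, and all numbers below are made up for illustration and are not taken from the analysis.

```python
def pseudo_frequency(matches_per_year, totals_per_year, per=100_000):
    """Matched articles per `per` indexed articles, for each year.

    Years with a missing or zero denominator are skipped rather than guessed.
    """
    adjusted = {}
    for year, matches in sorted(matches_per_year.items()):
        total = totals_per_year.get(year, 0)
        if total > 0:
            adjusted[year] = per * matches / total
    return adjusted


# Hypothetical counts, for illustration only.
matches = {2018: 50, 2019: 100}
totals = {2018: 100_000, 2019: 125_000}
adjusted = pseudo_frequency(matches, totals)
print(adjusted)
```

In this toy case the raw count doubles while the adjusted value grows by only 60%, which is the kind of difference the normalization is meant to expose.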

krassowski commented 4 years ago

In this case, using the absolute counts might be ok - what do you think?

vd4mmind commented 4 years ago

@krassowski

The absolute count looks fine to me. We can simply describe it as the overall distribution of the terms over the past 15 years across the various journals indexed in PubMed, using divergent terms to represent multi-omics. I don’t think we have to normalize here. I like that 2020 isn’t complete yet, so there is a dip. If a reviewer asks, we can always provide the adjusted counts.

vd4mmind commented 4 years ago

No point in re-running the search and download. We can just mention our monthly window up to 2020 for clarity.

krassowski commented 4 years ago

I also tried using a yearly fraction instead to represent changes in trends, but the "noise" (single articles before the year 2010) gets promoted to what seem to be major shifts, so I think it confuses the reader more than it helps:

image

krassowski commented 4 years ago

But I think we can ignore 2002 (1 match) and 2004 (2 matches) and start in 2005 (10 matches) - the recent changes are certainly more interesting.
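The noise problem and the effect of the 2005 cutoff can be illustrated with a small sketch. It assumes that "yearly fraction" means each term's share of that year's matched articles, which the thread does not spell out. The yearly totals (1, 2 and 10) are the ones quoted above; the split between the placeholder terms "term A" and "term B" is invented.

```python
from collections import Counter, defaultdict


def yearly_term_share(records):
    """Compute each term's share of matched articles within every year.

    `records` is an iterable of (year, term) pairs, one per matched article.
    """
    by_year = defaultdict(Counter)
    for year, term in records:
        by_year[year][term] += 1
    shares = {}
    for year, counts in sorted(by_year.items()):
        total = sum(counts.values())
        shares[year] = {term: n / total for term, n in counts.items()}
    return shares


# Yearly totals follow the thread; the term labels and split are hypothetical.
records = (
    [(2002, "term A")] * 1
    + [(2004, "term B")] * 2
    + [(2005, "term A")] * 6
    + [(2005, "term B")] * 4
)
shares = yearly_term_share(records)
for year, by_term in shares.items():
    print(year, by_term)

# Dropping the sparse early years leaves only the 2005 onwards window.
recent = yearly_term_share([r for r in records if r[0] >= 2005])
```

With one or two articles in a year, a single term takes 100% of the share, so the plot appears to swing from one term to another even though only three articles are involved.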

vd4mmind commented 4 years ago

Yeah. This is indeed noisy and not very clear. For me, the first one without adjustment is still good to go.

vd4mmind commented 4 years ago

Ah yes. Let’s keep the window at 2005-2020. That is better.

krassowski commented 4 years ago

Just to quickly show what I am optimizing for (space):

image

And this is before the inclusion of the flow diagram and a fourth panel (disease).

vd4mmind commented 4 years ago

This looks pretty good & very impressive @krassowski. It’s getting there. I really like it. 😃

biswapriyamisra commented 4 years ago

Mike:

Great job indeed:

[1] I would need an "overview short caption" and a "semi-descriptive caption" for the entire Figure. If you want to have 2 or 3 Figures (each figure with multiple, typically 4, panels), that is fine too.

[2] Please also send me the final 4-5 sentence "methods" description of your analysis and a link to the Git page that will eventually be shared online with the publication.

Let me know if there are any more questions and we will be done with these.

Thanks a lot, Biswa

On Sun, Aug 2, 2020 at 10:42 AM ivivek87 notifications@github.com wrote:

This looks pretty good & very impressive @krassowski https://github.com/krassowski . It’s getting there. I really like it. 😃

krassowski commented 4 years ago

Done in ef1465d4a6c1f6d1dc054200d6d858c22c7feb35