rempsyc / busara_dashboard

The Missing Majority in Behavioural Science Dashboard
https://remi-theriault.com/dashboards/missing_majority

Journal selection #2

Closed psforscher closed 5 months ago

psforscher commented 8 months ago

Our selection of journals is a bit narrow, particularly in economics. We could also consider adding some explicitly African, South American, and Asian journals. Chaning Jang and Maya Ranganath had some specific suggestions for broadening our selection of economics journals.

Some economics journals from Amy Shipow and Maya Ranganath:

From Chaning Jang:

More info here: https://docs.google.com/document/d/1e_cf1M1vUAzeyKCq77ClJPa5cDB-XAk60bI0cCKUOzw/edit

psforscher commented 8 months ago

More thoughts from Hans:

Although I am not a fan of JIFs, I would have selected the 10 journals with the highest JIFs, because that is what people often look at (e.g., this list). To communicate your values, you can also include the journals with the 10 highest TOP Factors in psychology (or the top 5, if you go by the total list you currently have, so 10 in total). Longer term, it can be useful to include African, South American, and Asian journals as well (many of them, despite having general names, are Europe- or US-based/focused).

rempsyc commented 6 months ago

In one multi-lab paper I am coauthoring, the authors list a number of economic journals, and also justify their choices. Perhaps we could add those to the list and get inspiration from their justification.

For this paper, our focus is on the following nine economic journals: (1) American Economic Review, (2) American Economic Review: Insights, (3) American Economic Journal: Applied Economics, (4) American Economic Journal: Economic Policy and (5) American Economic Journal: Macroeconomics, (6) The Economic Journal, (7) Journal of Political Economy, (8) Quarterly Journal of Economics and (9) Review of Economic Studies. For political science, our focus is on three journals: (1) American Journal of Political Science, (2) American Political Science Review and (3) Journal of Politics.

These journals were selected for multiple reasons. First, all of these journals are considered leading outlets in their respective disciplines. Second, they all have a data and code availability policy. Third, most of these journals conduct computational reproducibility checks for most accepted articles. [...] Our sample of journals should thus be seen as highly selective.

rempsyc commented 6 months ago

Here is a master list of the proposed journals and their integration which I will update from time to time:

Also included the suggestions from Chuan-Peng:

I think there are many similar journals that are not included. For example, in developmental psych, Child Development and Developmental Science are good journals too, and in social psych, PSPB is also regarded as a good journal.

Note: When searching, sometimes adding "The" in front of the journal name is necessary to get the right results in PubMed. Frequently, there are no results at all without the "The".
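To make the "The" check less manual, one option is to generate both spellings of a journal name up front and try each in turn. This is only a minimal sketch: `journal_name_variants` is a hypothetical helper, not part of pubmedDashboard, and it only builds strings (no PubMed call is made).

```python
def journal_name_variants(name: str) -> list[str]:
    """Return the name as given, followed by the variant with or without a leading 'The'.

    Hypothetical helper for illustration; the caller decides how to query PubMed
    with each variant and which one returns the right journal.
    """
    stripped = name.strip()
    if stripped.lower().startswith("the "):
        # Name already starts with "The": also try it without the article.
        alternate = stripped[4:]
    else:
        # Name has no article: also try it with "The" in front.
        alternate = "The " + stripped
    return [stripped, alternate]


if __name__ == "__main__":
    for candidate in journal_name_variants("Economic Journal"):
        print(candidate)
```

The results of each variant would still need to be checked by hand, since the wrong journal can come up even with the right keywords.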

rempsyc commented 6 months ago

Here is the general workflow for adding journals:

  1. First, we have to identify the correct journal name. The common journal name is not always the same as the one PubMed uses internally, so manual work is required to validate the correct internal name for each journal.
  2. Second, we have to make sure that we are able to fetch the data from PubMed. This can fail, for example, when the amount of data is too large (e.g., in the case of PLOS One) or too small (e.g., some journals have barely any publications included).
  3. Third, sometimes the wrong journals come up, even when using the right keywords, so these need to be filtered out manually.
  4. Fourth, we have to update the pubmedDashboard package to include the correct journal name by default, with the proper field categorization (psychology, economics, general, etc.).
  5. Fifth, we have to run a loop per year to redownload the papers of all new journals. At this step, there are frequently errors because the data becomes too large to handle, especially for large journals (e.g., Science and Nature), which need to be processed separately so that they do not break the overall workflow.
  6. Sixth, when all the pre-2020 data has been securely downloaded, we can set up the automatic weekly download for 2020+.
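As a rough illustration of step 5, the per-year loop could put the large journals in their own pass and record failures instead of stopping. This is a sketch under assumptions, not the actual pubmedDashboard code: `plan_downloads`, `run_downloads`, and the `fetch` callable are all hypothetical names.

```python
from typing import Callable


def plan_downloads(journals, years, large_journals):
    """Build (journal, year) tasks: regular journals year by year, large journals last."""
    large = set(large_journals)
    regular = [j for j in journals if j not in large]
    big = [j for j in journals if j in large]
    tasks = [(j, y) for y in years for j in regular]
    # Large journals are handled separately, after everything else.
    tasks += [(j, y) for j in big for y in years]
    return tasks


def run_downloads(tasks, fetch: Callable[[str, int], None]):
    """Run every task; a failing download is recorded rather than aborting the loop."""
    done, failed = [], []
    for journal, year in tasks:
        try:
            fetch(journal, year)  # hypothetical download function supplied by the caller
            done.append((journal, year))
        except Exception as err:
            failed.append((journal, year, str(err)))
    return done, failed


if __name__ == "__main__":
    def fake_fetch(journal, year):
        # Stand-in for a real download; pretends one large batch fails.
        if journal == "Science" and year == 2019:
            raise RuntimeError("too large")

    tasks = plan_downloads(["Collabra", "Science"], [2018, 2019], ["Science"])
    print(run_downloads(tasks, fake_fetch))
```

Keeping the failed tasks in a list makes it possible to retry only those, rather than rerunning the whole loop.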

So this step is among the most time-consuming. Unfortunately, it takes a LOT of back and forth to get the journal names processed properly.

rempsyc commented 6 months ago

It would be useful to add a tally page showing the number of papers per journal, so we can see which journals are overrepresented and which are underrepresented because of the PubMed data (e.g., Collabra). See #22
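The tally itself is simple to compute once the papers are in memory. A minimal sketch, assuming each paper is a record with a `journal` field (the record layout and the function name `tally_by_journal` are assumptions for illustration, and the sample data is made up):

```python
from collections import Counter


def tally_by_journal(papers):
    """Count papers per journal, most frequent first."""
    counts = Counter(paper["journal"] for paper in papers)
    return counts.most_common()


if __name__ == "__main__":
    # Made-up records, only to show the shape of the input.
    sample = [
        {"journal": "Journal A"},
        {"journal": "Journal A"},
        {"journal": "Journal B"},
    ]
    for journal, n in tally_by_journal(sample):
        print(f"{journal}: {n}")
```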

rempsyc commented 6 months ago

Some lesser-known journals are simply not indexed on PubMed, which is another limitation that should be discussed in #18 in terms of the representativeness of the data.

rempsyc commented 6 months ago

Our newly added tally allows us to see that some journals are missing a significant number of years. It would be helpful to investigate this further (#23).
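One way to list the gaps explicitly is to compare the years seen for each journal against the expected range. This is a sketch only: the `journal` and `year` fields and the `missing_years` helper are assumptions, not the dashboard's actual data structure.

```python
def missing_years(papers, first_year, last_year):
    """Map each journal to the sorted list of years in [first_year, last_year] with no papers."""
    seen = {}
    for paper in papers:
        seen.setdefault(paper["journal"], set()).add(paper["year"])
    expected = set(range(first_year, last_year + 1))
    # Only journals with at least one gap are reported.
    return {
        journal: sorted(expected - years)
        for journal, years in seen.items()
        if expected - years
    }


if __name__ == "__main__":
    # Made-up records, only to show the shape of the input.
    sample = [
        {"journal": "Journal A", "year": 2018},
        {"journal": "Journal A", "year": 2020},
        {"journal": "Journal B", "year": 2019},
    ]
    print(missing_years(sample, 2018, 2020))
```

Note that a journal with no papers at all never appears in `seen`, so it would need to be checked against the master list separately.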

rempsyc commented 6 months ago

The quantity of new journals is getting a bit overwhelming to manage (e.g., see #26)... I wonder whether we should be more selective. One possibility would be to think about the goal of the dashboard. Is it to show the misrepresentation in the most-read journals, captured (possibly imperfectly) by the impact factor? Choosing the impact factor would give us an objective criterion for journal selection.

Of course, it would exclude more diverse journals, but isn't it the point that those other journals are somehow less read or less popular? Without using the impact factor, there are literally hundreds of journals we could go for, and I see no reason why we shouldn't include them... But how informative are those journals for our storyline? Would finding that diverse journals are also more diverse in their first authorship change our take-home message, e.g., if the problem exists only in high-impact-factor journals?

Another option to better manage the number of journals would be to do one dashboard per discipline. One for psych journals only, one for economics journals only, and one for interdisciplinary journals only.

psforscher commented 6 months ago

The goal is to show how the representation in behavioral science journals changes over time and thereby serve as a nudge to encourage journals to change their practices. Behavioral science is an interdisciplinary field that draws on economics, psychology, and a bit of anthropology, sociology, and political science.

I definitely want to change journals. This means we could target either journals that, if they do change their practices, would be unusually influential in leading the way for other journals, and/or we could target journals that we judge are likely to be influenced by this dashboard because they already care about representation. I think there's a mix of these kinds of journals in the initial list I suggested: for example, Collabra is rather reform-minded, even if it only ranks so-so on JIF (note, of course, that prestige is not quite the same as high JIF).

Overall, I'd rather not have multiple dashboards if we can avoid it, but I'm open to it if that's the only way to manage the number of journals. I don't think we need to select every journal; we should prioritize those that are close to behavioral science (for example, social psychology or behavioral economics journals), those that are well-regarded, those that are likely to be affected by this nudge, and/or those that might make for interesting comparisons (if the African journals were on PubMed, those might be interesting).

All that said, the list you compiled in this comment is good/interesting, and if it's easy to implement the full list in a way that's user-friendly, I'd love that. But if you need to prioritize, I suggest the criteria listed above.

rempsyc commented 5 months ago

Alright, so I've had a breakthrough in #26 (using double quotes for journal names in the PubMed query solved a lot of issues), and was able to download data for all available journals by making repeated API calls that divide the requests into ever smaller batches. The current dashboard shows the new data. However, there is still a problem with journals using "&" symbols internally (which is a big problem for PSPB, for example; see #28).
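For readers following along, the two ideas (quoting the journal name, and splitting a request into ever smaller batches) can be sketched as below. This is not the code from #26. `[Journal]` and `[dp]` are standard PubMed field tags, but whether the dashboard builds its queries this way is an assumption, as are the function names, the date-window bisection, and the `count` callable, which stands in for whatever returns the number of hits for a window.

```python
from datetime import date, timedelta


def journal_query(journal: str, start: date, end: date) -> str:
    """Build a PubMed search term with the journal name in double quotes."""
    return f'"{journal}"[Journal] AND {start:%Y/%m/%d}:{end:%Y/%m/%d}[dp]'


def split_until_small(start, end, count, max_records):
    """Bisect the date window until each piece has at most max_records hits.

    `count(start, end)` is a caller-supplied (hypothetical) function returning
    the number of records in the window. A single-day window is returned as-is
    even if it is still over the limit, since it cannot be split further.
    """
    n = count(start, end)
    if n <= max_records or start >= end:
        return [(start, end, n)]
    mid = start + (end - start) // 2
    left = split_until_small(start, mid, count, max_records)
    right = split_until_small(mid + timedelta(days=1), end, count, max_records)
    return left + right


if __name__ == "__main__":
    def fake_count(start, end):
        # Stand-in for a real count request: pretend 10 papers per day.
        return ((end - start).days + 1) * 10

    for lo, hi, n in split_until_small(date(2019, 1, 1), date(2019, 1, 31), fake_count, 100):
        print(journal_query("The Economic Journal", lo, hi), n)
```

Each resulting window can then be downloaded with its own call, so no single request has to handle the full volume of a large journal.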

A first insight is that economics journals have WAY fewer papers available.

Second, the waffle plot by journals may not be adequate anymore given the volume of journals (#29).

Third, I would eventually like to add my own list of suggested journals, but for now I think I had better deal with the other open issues first...

rempsyc commented 5 months ago

I think we can close this for now and open a new journal selection issue in the future once we get there?