OpenAPC / openapc-de

Collect and disseminate information on fee-based Open Access publishing
https://treemaps.openapc.net/

Ask contributors to provide data on all APC payments #97

Open tullney opened 8 years ago

tullney commented 8 years ago

We should aim at having the best data for analyses of APC payments. Therefore, the data should be as complete as possible. In some cases, contributors can - for pragmatic reasons - only contribute parts of their data (e.g. payments to a particular publisher) and have to postpone contributing other data. This leaves us with a bias in terms of 'biggest recipients of APC payments'. Examples: MPDL has added more and more publishers to their data (https://github.com/OpenAPC/openapc-de/commits/master/data/mpg), while for LMU München there’s just Springer/BMC data (https://github.com/OpenAPC/openapc-de/tree/master/data/lmu). This picture shows the huge impact this has on the visualization of APC recipients (data on payments from German universities to publishers, with LMU in turquoise):

[Image: publishers]

While these issues might improve over time, there’s another concern I’d like to add to the picture: institutions choosing to deliberately hold back parts of their data. See e.g. this statement from the University of Leipzig (cc: @vielera):

Data on hybrid OA and APC above 2.000 EUR are not included (exceptions due to currency exchange rates). (https://github.com/OpenAPC/openapc-de/blob/0901f552ca166c5cba8e702c9e6807a443b39f19/data/unileipzig/README.md)

We all know that we can only gather parts of the APC payments; there is a huge gray area. But if the data is available to a data provider, then I’d really suggest that this data be provided to the Open APC initiative as completely as possible. There might be problems or concerns I did not see, but maybe we can at least agree on the general goal?

Maybe we can have a discussion here and/or during the workshop in April? What do you think?

vielera commented 8 years ago

I think this is a very important issue. Regarding my statement on the Leipzig University dataset, I'd like to clarify that we do not deliberately hold back any data; we simply do not have access to data on hybrid APCs or on APCs that are not covered by the fund. The information has been added for transparency and to allow a better interpretation of the dataset. Maybe it would help if we could agree on a minimum of explanation accompanying each dataset that clarifies its completeness?

dirkpieper commented 8 years ago

Thanks for the discussion, which I would like to continue at our workshop in Bielefeld. I also agree with vielera that this is not a question of holding back data. For example, naming the APC cost for an RSC voucher is a challenge. In another case, we discussed how to deal with invoices above the DFG price cap and how to deal with splitting the amount of money. In this case we thought it's better to name the total price for the institution, which differs from the expenditure of the funds.
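
To illustrate how the total price and the fund's expenditure can differ, here is a minimal sketch. It assumes a cap of 2,000 EUR and assumes that the fund pays up to the cap while the institution covers the remainder; the function name and the invoice amount are hypothetical and not taken from any contributed dataset.

```python
def split_invoice(total_eur, cap_eur=2000.0):
    """Split an APC invoice into a fund share and an institutional remainder.

    Assumption (for illustration only): the fund pays at most cap_eur
    and the institution covers whatever exceeds the cap.
    """
    fund_share = min(total_eur, cap_eur)
    institution_share = total_eur - fund_share
    return fund_share, institution_share


# Hypothetical invoice of 2,600 EUR
fund, rest = split_invoice(2600.0)
reported_total = fund + rest  # the total price for the institution

print("fund expenditure:", fund)
print("institutional remainder:", rest)
print("total price reported:", reported_total)
```

Under these assumptions, reporting only the fund share would understate what the publisher actually received, which is why the total price might be the more useful figure for analyses.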