nrc-cnrc / COVID-US

Open benchmark dataset of COVID-19 related ultrasound imaging data, curated and systematically validated
GNU General Public License v3.0

Erroneous novelty claims for your LUS database #1

Closed NinaWie closed 2 years ago

NinaWie commented 3 years ago

Hi,

I came across your paper and it is great to see a growing interest in lung US and its use in the pandemic.

However, my colleagues and I find the presentation of your dataset in the paper and on this GitHub repository highly inappropriate. You claim the following in your paper:

To the best of the authors' knowledge, COVIDx-US is the first open-access benchmark LUS dataset that is highly reproducible, easy to use, and highly scalable thanks to the modular well-documented design. Furthermore, to the best of the authors' knowledge, COVIDx-US is also the largest, fully curated open-access benchmark LUS dataset in the research literature.

This statement is incorrect, because we already published such an open-access lung US dataset in April of last year (https://github.com/jannisborn/covid19_ultrasound/tree/master/data). Our initial preprint, POCOVID-Net: Automatic Detection of COVID-19 From a New Lung Ultrasound Imaging Dataset (POCUS), is even cited several times in your paper, but the paper omits that POCOVID-Net already published an open-access dataset, one constructed from the same data sources, including ButterflyNetwork, GrepMed, The Pocus Atlas and LITFL. Your database is therefore essentially a subset of ours, which by now comprises more than 200 videos and is thus significantly larger. We have also provided scripts to process the data automatically, making it “highly reproducible, easy to use, and highly scalable”, as in your claim. Our work has since also been published in a journal: https://www.mdpi.com/2076-3417/11/2/672/html.

It is thus very misleading, and dismissive of our work, to state that you have published the first and largest open-access LUS dataset. We would appreciate it if you corrected this error and referenced previous work appropriately. Thank you for your understanding.

ashkan-nrc commented 3 years ago

Dear Ms. Wiedemann,

Many thanks for your message. It is always nice hearing from peers around the world who are contributing to the fight against the COVID-19 pandemic.

We first apologize for any misunderstanding regarding the COVIDx-US and POCOVID datasets. We would like to clarify the similarities and differences between the two. We acknowledge that the POCOVID dataset is the first open-access POCUS dataset for COVID-19 detection and contains more data than COVIDx-US. As you mentioned, we have already cited the POCOVID-Net paper in our paper, and we would be more than happy to list your paper on our GitHub repo and link to your repo as well. The creation of the POCOVID dataset has of course been a significant contribution to the scientific community, and we will highlight this better in a revision of our paper.

One of the main contributions of our work is a systematic framework for data curation, data processing, and data validation, used to create a unified, standardized POCUS dataset. We also tried our best to make the framework easy to use and easy to scale, even for users without deep computer science or programming knowledge, and we hope its availability will help the community scale and expand such datasets in a semi-automated manner. We hope this explanation clears up the misunderstanding, and we will revise our paper to state our contributions and the differences more clearly.

We wish you and your colleagues safety and health.

Kind regards, Ashkan

NinaWie commented 3 years ago

Dear Ashkan,

Thanks a lot for your reply, and thank you for your understanding of our concerns. We appreciate your offer to refer to our dataset and to clarify the statements in your paper. Since an arXiv preprint can easily be updated, it would be great if you could correct the claims.

Regarding your contributions, with your explanation I now understand your focus much better. I did not regard your main contribution as a "systematic framework for data curation" at first, because we also provide scripts to download the data and process it (e.g. in this script), and a script to build a curated image dataset in essentially one command. Since we provide all licensed videos directly on our GitHub repo, someone who is not familiar with GitHub can still download them as a zip folder from our repository, whereas on your repo they would have to execute the Jupyter notebook.

Please don't get me wrong - I see the necessity of providing data in an easy-to-use manner, and it is always great to see efforts in this direction. We only find the claims in your paper problematic because, in our view, there is no major contribution compared to our work. This is something that an independent reviewer has to judge (in case you submitted the paper to a journal), but the work of a reviewer is hindered if the comparison to related work in your paper is erroneous or incomplete. This is why we hope to see this clarified.

In addition, a colleague of mine who is a medical doctor told me to inform you that she strongly disagrees with another statement in your paper (unrelated to our work):

Finally, RT-PCR tests do not provide additional information that supports clinical decision-making with respect to the triage of infected patients, treatment options, and predictions of patient outcomes that may assist in resource allocation.

Since PCR is accepted worldwide as the gold standard for COVID testing and has repeatedly been shown in studies to have a very high sensitivity, it seems odd to claim that a PCR test does not support clinical decision-making. Maybe you should also consider rephrasing this.

Thanks again for the kind reply and your offer to revise.

Best, Nina

ashkan-nrc commented 2 years ago