Shakesbeery / vigipy

A medical device/pharmacovigilance library written in python
GNU General Public License v3.0

Code creates output where there's no data #4

Closed ylchang closed 2 years ago

ylchang commented 2 years ago

Hi,

I've applied vigipy's BCPNN to a dataset that contains data for various devices across several quarters. For a few of these devices, the data are censored before the end of the full quarterly range covered by the dataset. For example, the full dataset covers 2000Q1 through 2020Q4, but product X might only have data from 2013Q2 through 2018Q1. When I ran BCPNN, I noticed that the event Count for product X (for the sake of argument) accumulates up to 2018Q1 and then stays the same through 2020Q4. The signal, the 2.5% quantile of the IC, is still calculated through the end of 2020Q4, even though there are no data for product X between 2018Q1 and 2020Q4. So the result is quite misleading. Maybe I misused the code? Or maybe I am misinterpreting the results?
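The plateau described above is what any cumulative count does once new reports stop. A minimal sketch with made-up numbers (the quarters and counts are hypothetical and are not taken from the dataset in this issue; this is not vigipy code):

```python
from itertools import accumulate

# Hypothetical new-report counts per quarter for one product.
# Reporting stops after 2018Q1, so later quarters contribute zero.
quarters = ["2017Q3", "2017Q4", "2018Q1", "2018Q2", "2018Q3", "2018Q4"]
new_reports = [4, 2, 3, 0, 0, 0]

# A running total never decreases: once reports stop it stays flat.
cumulative = list(accumulate(new_reports))

for quarter, new, total in zip(quarters, new_reports, cumulative):
    print(f"{quarter}: new={new} cumulative={total}")
```

Because the running total stays at its last value, any statistic computed from it can still be produced for every later quarter.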

Thank you very much for your time and help.

Regards,

Shakesbeery commented 2 years ago

@ylchang - I think I see what you're asking, but let me clarify.

Currently, vigipy assumes that a period with no reports may be due to either:

  1. A good safety profile (i.e. no reports because there are no reported AEs)
  2. Periodic voluntary reporting drops (i.e. a company accumulates reports and then dumps a large amount each quarter/year/etc)

For these cases, the cumulative measure is still carried forward because we expect that reports may begin again at any time; however, your use case wants a more nuanced approach. During periods of no reporting your preference would be to completely drop all reports for products that have not had AEs recorded. Is this correct? For example, product X would have signals generated for 2013-2018Q1, but in 2018Q2 there are no reports, so product X is completely dropped from consideration. Later, in 2024Q3, product X has another report, so all cumulative data is once again processed and a new signal is generated. Is that the logical flow you were thinking about?
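The two flows discussed above can be contrasted in a small sketch. The function name `signal_periods` and the `drop_gaps` flag are hypothetical illustrations, not part of vigipy's API:

```python
def signal_periods(new_reports_by_quarter, drop_gaps):
    """Return the (quarter, cumulative_count) pairs that would be scored.

    drop_gaps=False: carry the cumulative count through quiet quarters.
    drop_gaps=True:  skip any quarter in which the product has no new
                     reports, but keep accumulating for later quarters.
    """
    scored = []
    running_total = 0
    for quarter, new in new_reports_by_quarter:
        running_total += new
        if drop_gaps and new == 0:
            continue  # product is left out of this quarter's analysis
        scored.append((quarter, running_total))
    return scored


# Hypothetical history: two quiet quarters, then reports resume.
history = [("2018Q1", 3), ("2018Q2", 0), ("2018Q3", 0), ("2018Q4", 2)]

carried = signal_periods(history, drop_gaps=False)
dropped = signal_periods(history, drop_gaps=True)
print(carried)
print(dropped)
```

With `drop_gaps=True` the product disappears during the quiet quarters and reappears with its full cumulative history once a new report arrives.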

Alternatively, for products that are discontinued, we may not want to use them in our calculations anymore, but I think that becomes a data cleansing issue prior to analysis.

If I have the wrong idea, maybe you could describe in more detail the output pattern you would expect to see.

ylchang commented 2 years ago

This image shows the counts of binary data for various devices over the time frame shown on the x-axis: [image]

As you can see, Riata, Riata ST and Sprint Fidelis have no data after certain quarters. However, the BCPNN result shows that their signals become even more significant after their data were suspended. [image]

The effect of the suspension is not shown here. So how should I interpret this result? Shouldn't the DPA show a drop in the signal after the data are suspended? I am confused...
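One possible reason a disproportionality score keeps moving after a product's own counts freeze is that it also depends on the rest of the database, which keeps growing. The sketch below uses a common shrinkage form of the IC point estimate, log2((O + 0.5) / (E + 0.5)). It is an illustration with made-up counts, not vigipy's BCPNN implementation, and it does not compute the 2.5% quantile:

```python
import math


def shrinkage_ic(n11, n1_, n_1, n):
    """IC point estimate log2((O + 0.5) / (E + 0.5)).

    n11: reports with both the product and the event (observed, O)
    n1_: all reports for the product
    n_1: all reports with the event
    n:   all reports in the database
    Expected count under independence: E = n1_ * n_1 / n.
    """
    expected = n1_ * n_1 / n
    return math.log2((n11 + 0.5) / (expected + 0.5))


# Hypothetical product frozen at 40 event reports out of 100 reports.
# At the freeze: 1000 reports in the database, 200 with the event.
ic_at_freeze = shrinkage_ic(40, 100, 200, 1000)

# Later: the database doubles, but other products rarely report the
# event, so the event total only grows from 200 to 250.
ic_later = shrinkage_ic(40, 100, 250, 2000)

print(round(ic_at_freeze, 3), round(ic_later, 3))
```

In this made-up case the expected count falls from 20 to 12.5 while the observed count stays at 40, so the estimate rises even though the product has no new data. Whether this is what happens for the devices in the plots depends on the actual counts.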

It looks like this may be my misunderstanding of the DPA rather than a code issue. Should we continue this discussion here, or use another venue?

Thank you so much for your further help.

Shakesbeery commented 2 years ago

Yes, this appears to be more of an interpretation question and I would say that it's behaving as expected.

Please feel free to reach out to me over email and I'd be happy to discuss further.

ylchang commented 2 years ago

Hi, David,

Thank you very much for your support.

I think I figured it out. In my case I should not be using the LM. After I divided my data into subgroups by quarter, I got results that better resemble those from prior work. With that observation, I closed the issue on GitHub.

Thank you again for your precious time and guidance.

Regards,

Yu-Li

On Wed, Mar 9, 2022 at 11:13 PM David Beery wrote:

Yes, this appears to be more of an interpretation question and I would say that it's behaving as expected.

Please feel free to reach out to me over email and I'd be happy to discuss further.


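-- Yu-Li Chang PhD, CRE, CQE https://www.linkedin.com/in/yulichang In great and difficult matters, look to one's sense of responsibility; in good times and bad, to one's breadth of mind; in joy and sorrow, to one's composure; in giving up and gaining, to one's wisdom; in success and failure, to one's perseverance.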

Shakesbeery commented 2 years ago

@ylchang - FYI: I have an experimental branch called disjoint-model. It now exposes the function run_disjoint() in the longitudinal model, which uses only the resampled window instead of the cumulative reports. I think this is what you were originally looking for. If so, let me know whether it works as expected.
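The difference between cumulative and window-only counting can be sketched as follows. The helper names and data are hypothetical. This is not the implementation of run_disjoint(), whose signature is not shown in this thread:

```python
from collections import Counter

# Hypothetical (quarter, product) report rows.
reports = [
    ("2018Q1", "X"), ("2018Q1", "X"), ("2018Q1", "Y"),
    ("2018Q2", "Y"),
    ("2018Q3", "Y"), ("2018Q3", "Y"),
]
quarters = sorted({quarter for quarter, _ in reports})


def cumulative_counts(rows, periods):
    """Count every report up to and including each period.

    Labels of the form YYYYQn sort chronologically as plain strings.
    """
    return {p: Counter(prod for q, prod in rows if q <= p) for p in periods}


def disjoint_counts(rows, periods):
    """Count only the reports that fall inside each period."""
    return {p: Counter(prod for q, prod in rows if q == p) for p in periods}


cumulative = cumulative_counts(reports, quarters)
disjoint = disjoint_counts(reports, quarters)

# Product X stops reporting after 2018Q1.
print(cumulative["2018Q3"]["X"])  # carried forward from 2018Q1
print(disjoint["2018Q3"]["X"])    # no reports in this window
```

With window-only counts, a product with no reports in a quarter contributes nothing to that quarter, which matches the behaviour requested at the start of this issue.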