pmelchior / pygmmis

Gaussian mixture model for incomplete (missing or truncated) and noisy data
MIT License

Troubled by 1 instance of np.flatnonzero #21

Open wgandler opened 2 years ago

wgandler commented 2 years ago

Hi,

I am troubled by one instance of np.flatnonzero. In _Esum, the lines are: if U_k is None: U_k = np.flatnonzero(indices). Removing the 0 index makes U_k one element shorter than chi2, T_inv_k, and log_p[k]. I don't see why the zero index is removed.

Sincerely,

William Gandler

pmelchior commented 2 years ago

Hello William. Are you asking about why that selection U_k = np.flatnonzero(indices) is done, or do you have trouble that is caused by this line?

wgandler commented 1 year ago

Dear Peter Melchior,

Thank you for replying to my concern. Specifically, I am porting your code to Java for use in the MIPAV medical imaging software at NIH, and I noticed that when a zero index was present, the line U_k = np.flatnonzero(indices) produced a U_k that was one element shorter than the chi2, T_inv_k, and log_p[k] arrays created in the same _Esum function. Also, I could not understand why the zero index would be excluded.

Sincerely,

William Gandler


pmelchior commented 1 year ago

I'm not sure how this can happen. The relevant part of the code is this: https://github.com/pmelchior/pygmmis/blob/87ad02dd607896205ccde3ca668971c6dcacd026/pygmmis.py#L993-L1001

The array indices is of type bool, and the first step, which sets U_k from its initial value None, lists the positions of all non-zero (i.e. non-False) elements of indices.

The following code does exactly the same in a nutshell:

import numpy as np

test = np.random.rand(10)
indices = test < 0.5  # boolean mask, True where the value is below 0.5
print(indices, np.flatnonzero(indices))
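
A deterministic variant of the same check may make the behavior easier to compare against a port. This is only a sketch: the array values and the names mask, positions, int_indices, and shifted are made up for illustration and do not come from pygmmis. It shows that np.flatnonzero keeps position 0 whenever the first element of a boolean array is True, and that a zero is skipped only when the input holds integer values instead of booleans. Whether that is what happens in the Java port is an assumption, not something confirmed in this thread.

```python
import numpy as np

# Hypothetical boolean mask whose first element is True.
mask = np.array([True, False, True, True, False])

# flatnonzero returns the positions of the True entries, including position 0.
positions = np.flatnonzero(mask)
print(positions)  # [0 2 3]

# The result has one entry per True element, so nothing is lost.
assert len(positions) == mask.sum()

# Assumed failure mode: passing integer index values instead of a bool mask.
# The value 0 counts as "zero" and is skipped, so the result is one shorter,
# and the output holds positions within int_indices, not the original values.
int_indices = np.array([0, 2, 3])
shifted = np.flatnonzero(int_indices)
print(shifted)  # [1 2]
```

If a port sees the second pattern, it may be worth checking that the array passed in is a boolean mask with the same length as chi2 rather than a list of integer indices.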