TrystanScottLambert / pyFoF

Python package to perform group finding in redshift surveys.
MIT License

progress bar after run.run() command does not work #73

Closed bmamrutha closed 7 months ago

bmamrutha commented 7 months ago

The progress bar after run.run() command does not update.

pyFoF_error

No errors are displayed since the module runs into an endless loop. Does anyone know how to resolve this issue? Attached is a screenshot of the progress bar, which does not update itself although it is 80% complete.

TrystanScottLambert commented 7 months ago

You're just running the example and it's not completing? Or does the program eventually finish?

bmamrutha commented 7 months ago

You're just running the example and it's not completing? Or does the program eventually finish?

The input file I am using is not the example file; it is another cluster catalogue. No, the program does not finish.

TrystanScottLambert commented 7 months ago

Maybe you could share the entire script? I'm also not entirely sure how the loading bars work in notebook environments.

I suggest you try a couple things:

1. Run the program with 5 trials only instead of 10 and see if that manages to finish.
2. Try running everything as a Python script instead of using a notebook.

TrystanScottLambert commented 7 months ago

@bmamrutha What would also be very helpful is if you ran this for a small number of trials with small linking lengths to see if that works, and then ran a small number of trials with larger linking lengths. That way we would be able to narrow down whether it's hanging because of the number of trials or something more to do with the size of the eventual linking lengths.

So you'd want to do something like this:

d0_initial = 0.3, d0_final = 0.5, n_trials = 5

See if that works then try:

d0_initial = 0.5, d0_final = 0.9, n_trials = 5
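
In code, those two test runs would look roughly like this (just a sketch: it assumes a test_survey Survey object is already set up as in your script, and the remaining arguments, v0_initial, v0_final, d_max, v_max and cutoff, are illustrative values from the example script, so keep whatever you currently use):

from pyFoF.experiment import Experiment

# Run 1: small linking lengths, few trials
run_small = Experiment(d0_initial=0.3, d0_final=0.5, v0_initial=100,
                       v0_final=500, d_max=2, v_max=1000,
                       n_trials=5, cutoff=0.5, survey=test_survey)
run_small.run()

# Run 2: larger linking lengths, same number of trials
run_large = Experiment(d0_initial=0.5, d0_final=0.9, v0_initial=100,
                       v0_final=500, d_max=2, v_max=1000,
                       n_trials=5, cutoff=0.5, survey=test_survey)
run_large.run()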

Report back on what happened. Whilst this is written to be as generalizable as possible, it really all comes down to the linking lengths that you choose, which will be very dependent on the survey. I'm wondering if, for your particular survey, the distances between the points are larger than what we tested with and there's a hidden infinite loop somewhere. Alternatively, there might just be something wrong with the rich.track progress bar. If you run the tests like I suggest, I think we might be able to narrow down the issue.

Let me know how it goes.

bmamrutha commented 7 months ago

Maybe you could share the entire script? I'm also not entirely sure how the loading bars work in notebook environments.

Attached is the whole script: the Python script (I changed the extension from test_fof.py to test_fof.txt so that I could upload it here) and the notebook (changed from testing_fof_notebook.ipynb to testing_fof_notebook.txt for the same reason).

testing_fof_notebook.txt

test_fof.txt

I suggest you try a couple things:

1. Run the program with 5 trials only instead of 10 and see if that manages to finish.

It does not finish even with 5 trials; screenshot attached.

fof_test_trial5

2. Try running everything as a Python script instead of using a notebook.

When I run it as a Python script, it does not finish either.

running_py_script

I can leave it for some more time and check if it finishes after that.

bmamrutha commented 7 months ago

Hi, thank you for suggesting. I have narrowed the linking lengths as suggested by you. Here are screenshots of the two trials;

d0_initial = 0.3, d0_final = 0.5, n_trials = 5

fof_linking_length_1

d0_initial = 0.5, d0_final = 0.9, n_trials = 5

fof_linking_length_2

The run still does not finish.

I tried this script for another cluster catalogue and this finishes well. Here is the screenshot attached for the second cluster:

fof_Cluster_catalogue_2

TrystanScottLambert commented 7 months ago

Then the problem seems to be the data set. Would you be willing to share the raw input file? Or could you tell me what the difference is between the two catalogs? What are the columns in both cases?

It's an interesting bug that would be worth investigating further.

TrystanScottLambert commented 7 months ago

@bmamrutha, I've noticed that you're manually reading in the data using astropy.Tables. Could you try reading in the data using the data handling routines provided with PyFoF?

from pyFoF.data_handling import read_data
INFILE = 'catalog.dat'
data = read_data(INFILE)

A lot of the hangups we have are dealt with during this reading-in process.

bmamrutha commented 7 months ago

Then the problem seems to be the data set. Would you be willing to share the raw input file? Or could you tell me what the difference is between the two catalogs? What are the columns in both cases?

It's an interesting bug that would be worth investigating further.

The two catalogues are for two different galaxy clusters. The columns used are the same for both; I use r_magnitude and photometric_redshifts in both cases. Sadly, I cannot share the catalogues because they are not public.

bmamrutha commented 7 months ago

@bmamrutha, I've noticed that you're manually reading in the data using astropy.Tables. Could you try reading in the data using the data handling routines provided with PyFoF?

from pyFoF.data_handling import read_data
INFILE = 'catalog.dat'
data = read_data(INFILE)

A lot of the hangups we have are dealt with during this reading-in process.

Even after changing the reading-in code from:

from pyFoF.data_handling import read_data
INFILE = 'catalogue.fits'
from astropy.table import Table
dat = Table.read(INFILE, format='fits')
dat['RA_J2000'].name = 'ra'
dat['Dec_J2000'].name = 'dec'
data = dat.to_pandas()
data

to:

from pyFoF.data_handling import read_data
INFILE = 'catalogue.fits'
data = read_data(INFILE)

The run still does not terminate.

fof_commandline_catalogue

TrystanScottLambert commented 7 months ago

How many sources do you have in each catalog?

bmamrutha commented 7 months ago

Catalogue 1 (which works) has 6836 sources. Catalogue 2 (which does not work) has 3184 sources.

TrystanScottLambert commented 7 months ago

OK. It's a little difficult to diagnose anything without a minimal reproducible example that we could run. Off the top of my head, I can't think of why one catalog works and the other doesn't.

I think the best course of action would be to change the coordinates of your sources and hide any identifying information. Then you could share the catalog so that we could reproduce the error and solve it. In any case, it might be useful to change the coordinates of your targets anyway and see if the code will run.
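
For what it's worth, something like this would hide the field while keeping the geometry intact (just a sketch; the file and column names are placeholders, not anything from pyFoF):

import numpy as np
import pandas as pd

rng = np.random.default_rng()
cat = pd.read_csv('my_catalogue.csv')  # placeholder file name

# A constant shift in RA and a mirror in Dec preserve all pairwise angular
# separations, so the group finder should behave exactly the same, but the
# field is no longer identifiable.
cat['ra'] = (cat['ra'] + rng.uniform(0.0, 360.0)) % 360.0
cat['dec'] = -cat['dec']

cat.to_csv('my_catalogue_anonymised.csv', index=False)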

Maybe you'd also be willing to share, say, the first two lines of each catalog? Then at least we could try to see what the differences are that might cause this.

But like I say. Very difficult to do anything without a minimal reproducible example. Please consider constructing one.

@BrutishGuy any idea why the loading bars are getting stuck like this?

bmamrutha commented 7 months ago

Okay, I will try and reproduce one catalogue for you to verify. Thank you for suggesting.

bmamrutha commented 7 months ago

@TrystanScottLambert, here I attach the reproduced catalogues. Even with these random coordinates, Catalogue_1.fits completes the run while Catalogue_2.fits does not stop. I have 4 columns in each: ra, dec, rmag and z_photometric.

Catalogue_1.csv

Catalogue_2.csv

TrystanScottLambert commented 7 months ago

OK. I've found your problem. You have some negative photz values, in particular -99, which traditionally means that the field is empty, i.e. there isn't a photz value. After I removed those values the program worked fine.

The negative values would then be passed to a log function and would end up larger than any of the other velocities, causing that calculation to fail. But the code only stops once all galaxies have been looked at, so the galaxies with negative values would just never resolve, and that caused the infinite loop.

In future, just make sure that your data is cleaned with appropriate values and it should be fine. The program just didn't know how to deal with negative distances.

On our part we could think about adding a warning and some kind of data validation checks so people can be warned that their data isn't correct.

Here is the code if you want to run it yourself. Just make sure to remove the negative values from your catalog before running. They're at lines 222, 1369, 1372, and 1378.

import pandas as pd
from astropy.cosmology import FlatLambdaCDM
from pyFoF.experiment import Experiment
from pyFoF.survey import Survey

#Choose a cosmology
cosmo = FlatLambdaCDM(H0=70, Om0=0.3)

#Read in the data
data = pd.read_csv('Catalogue_2.csv')

#Set up Survey object
test_survey = Survey(data, cosmo, apparent_mag_limit=17.6,
                     alpha=-1.02, m_star=-24.2, phi_star=0.0108)

test_survey.convert_z_into_cz('ZPHOT_MEDIAN')
test_survey.make_mag_colum('RDERED')

#Run group finding algorithm
run = Experiment(d0_initial=0.3, d0_final=0.8, v0_initial=100,
                 v0_final=500, d_max=2, v_max=1000,
                 n_trials=10, cutoff=0.5, survey=test_survey)
run.run()
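
If it helps, one way to drop the placeholder rows before building the Survey would be something like this (a sketch; it assumes the same ZPHOT_MEDIAN column as in the script above, with -99 as the sentinel value mentioned earlier):

import pandas as pd

data = pd.read_csv('Catalogue_2.csv')

# Keep only rows with a physical photometric redshift; the -99 placeholders
# (and any other non-positive values) are removed before the Survey is built.
data = data[data['ZPHOT_MEDIAN'] > 0].reset_index(drop=True)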

bmamrutha commented 7 months ago

Oh! My bad! Thank you @TrystanScottLambert for helping me resolve the issue!

TrystanScottLambert commented 7 months ago

You're welcome. If you do end up using pyFoF for a publication, please let us know and cite Lambert et al. (2020).

bmamrutha commented 7 months ago

Surely will cite your work!