LSSTDESC / cosmodc2

Python package creating the cosmoDC2 synthetic galaxy catalog for LSST-DESC

number of unique halo IDs very close to number of galaxies #82

Closed yymao closed 5 years ago

yymao commented 5 years ago

@plaszczy reported in LSSTDESC/DC2-production#305 that there are almost as many unique halo IDs as the number of galaxies in cosmoDC2 1.0.

I can confirm that's the case with the following code.

import numpy as np
import GCRCatalogs

gc = GCRCatalogs.load_catalog('cosmoDC2_v1.0_small')

# Fraction of galaxies that carry a distinct halo ID
halo_id = gc['halo_id']
len(np.unique(halo_id)) / len(halo_id)

which gives 0.9633. Same result in 1.1.4.

Again, I don't think this is an "issue" per se. The HOD of cosmoDC2 looks reasonable. This issue is just to make sure that this feature is expected, @aphearin?

evevkovacs commented 5 years ago

Yes, the ultra-faint galaxies that are added are all centrals, so everything is consistent. Thanks for checking.

rmandelb commented 5 years ago

@plaszczy - from the above answer that @evevkovacs gave, I would like to suggest that you may wish to carry out your consistency checks using the (small) subset of objects that would be visible in these coadds, e.g., with r<25. There I expect the result would look different since the population is not the ultra-faint synthetic one but rather the ones that would appear in the images you're looking at.

(That is, assuming these checks are aimed at understanding the population that you're seeing in the images.)
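A minimal sketch of the suggested check, assuming the halo IDs and r-band magnitudes are already in memory as NumPy arrays. The function name `unique_halo_fraction` and the toy values are made up for illustration; with GCR the arrays might come from something like `gc.get_quantities(['halo_id', 'mag_r'])`, where the column name `mag_r` is an assumption to verify against the catalog schema.

```python
import numpy as np

def unique_halo_fraction(halo_id, mag_r, mag_limit=25.0):
    """Fraction of unique halo IDs among galaxies brighter than mag_limit."""
    halo_id = np.asarray(halo_id)
    bright = np.asarray(mag_r) < mag_limit  # smaller magnitude = brighter
    selected = halo_id[bright]
    if selected.size == 0:
        return float("nan")
    return np.unique(selected).size / selected.size

# Toy data, invented for illustration only (not catalog values).
halo_id = np.array([1, 1, 1, 2, 3, 4])
mag_r = np.array([22.0, 24.0, 24.5, 23.0, 27.0, 28.0])

frac_bright = unique_halo_fraction(halo_id, mag_r)                 # r < 25 only
frac_all = unique_halo_fraction(halo_id, mag_r, mag_limit=np.inf)  # no cut
print(frac_bright, frac_all)
```

In the toy data the two faint galaxies each sit alone in their own halo, so removing them lowers the unique-ID fraction. That is the direction of change the comment above anticipates for the real catalog.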

yymao commented 5 years ago

@evevkovacs thanks for confirming. Please feel free to close the issue as you see fit.

plaszczy commented 5 years ago

Just to be sure: I removed the ultra-faint galaxies (but did not otherwise cut on anything). There are now ~400M distinct haloes (out of 3G objects), but the per-halo statistics (a mean of 1.3 galaxies per halo) still seem low to me.

+-------+------------------+
|summary|             count|
+-------+------------------+
|  count|         413371635|
|   mean|1.3430078021681386|
| stddev|1.7810086119014759|
|    min|                 1|
|    max|              1339|
+-------+------------------+

Why are there so many haloes with one galaxy only (and not ultra-faint)?
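For reference, the same kind of per-halo summary can be sketched in NumPy, with the fraction of single-galaxy haloes added. This is an illustration on invented data, not the code that produced the table above; `occupation_summary` is a hypothetical helper name.

```python
import numpy as np

def occupation_summary(halo_id):
    """Summary statistics of the number of galaxies per halo."""
    # counts[i] is the number of galaxies sharing the i-th distinct halo ID
    _, counts = np.unique(np.asarray(halo_id), return_counts=True)
    return {
        "count": int(counts.size),                # number of distinct haloes
        "mean": float(counts.mean()),             # mean galaxies per halo
        "stddev": float(counts.std(ddof=1)),      # sample standard deviation
        "min": int(counts.min()),
        "max": int(counts.max()),
        "single_fraction": float(np.mean(counts == 1)),
    }

# Toy data, invented for illustration: one halo with 3 galaxies, three with 1.
summary = occupation_summary([10, 10, 10, 20, 30, 40])
print(summary)
```

A mean close to 1 together with a large maximum is what a population dominated by single-galaxy haloes with a few rich ones looks like; `single_fraction` makes that explicit.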

yymao commented 5 years ago

@plaszczy The halo mass function is a power law, and hence the halo population is dominated by low-mass halos, which usually host only one galaxy or none. Since the catalog does not include empty halos, it is not surprising that the majority of halos have only one galaxy. Note that, because of the power law, the single-galaxy halos we are discussing here outnumber the massive halos by orders of magnitude.

If you are still worried, you can look at the number of galaxies as a function of halo mass, and also the number of halos as a function of halo mass.
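One way to sketch that check, assuming per-galaxy arrays of halo ID and host halo mass (the column name `halo_mass` and the helper name `occupation_vs_halo_mass` are assumptions for illustration). Each halo is counted once, then both the number of halos and the number of galaxies are binned in log halo mass.

```python
import numpy as np

def occupation_vs_halo_mass(halo_id, halo_mass, bin_edges):
    """Return (n_halos, n_galaxies, mean_occupation) per log10(mass) bin."""
    halo_id = np.asarray(halo_id)
    halo_mass = np.asarray(halo_mass, dtype=float)
    # One entry per distinct halo: index of its first galaxy and member count.
    _, first_index, counts = np.unique(
        halo_id, return_index=True, return_counts=True
    )
    log_mass = np.log10(halo_mass[first_index])
    n_halos, _ = np.histogram(log_mass, bins=bin_edges)
    # Weighting each halo by its member count gives galaxies per mass bin.
    n_galaxies, _ = np.histogram(log_mass, bins=bin_edges, weights=counts)
    with np.errstate(invalid="ignore", divide="ignore"):
        mean_occupation = n_galaxies / n_halos  # NaN where a bin has no halos
    return n_halos, n_galaxies, mean_occupation

# Toy data, invented for illustration (masses in arbitrary solar-mass units).
halo_id = np.array([1, 2, 3, 4, 4, 4, 5, 5])
halo_mass = np.array([1e11, 1e11, 1e11, 1e13, 1e13, 1e13, 1e12, 1e12])
edges = np.array([10.5, 11.5, 12.5, 13.5])

n_halos, n_galaxies, mean_occupation = occupation_vs_halo_mass(
    halo_id, halo_mass, edges
)
print(n_halos, n_galaxies, mean_occupation)
```

The sketch assumes every galaxy of a given halo carries the same host mass, so the first member is enough to place the halo in a mass bin.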

plaszczy commented 5 years ago

I had a closer look at the non-faint (halo_id>0) galaxies and their associated haloes. I found (on 1.1.4) 942 cases where a galaxy is the single member of a halo but is not flagged as "is_central". How is that possible? Here are some examples:

+-------------+------------+----------+
|      halo_id|halo_members|is_central|
+-------------+------------+----------+
|  52300180203|           1|     false|
| 180600169171|           1|     false|
| 240500157241|           1|     false|
| 353000129163|           1|     false|
| 678000127180|           1|     false|
| 957300181315|           1|     false|
|1023600113137|           1|     false|
|1277400181189|           1|     false|
|1940600113219|           1|     false|
|2054300169219|           1|     false|
|2144700112194|           1|     false|
|2272300155144|           1|     false|
+-------------+------------+----------+

yymao commented 5 years ago

What are the stellar masses of these galaxies and the halo masses of their host halos? Are these galaxies sitting close to the spatial boundaries of the catalog?
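A minimal sketch of how such galaxies could be selected so that their masses can be inspected, assuming per-galaxy arrays for `halo_id`, `is_central`, `stellar_mass`, and `halo_mass` (the last two column names are assumptions, and `single_member_noncentrals` is a hypothetical helper). Any cut to non-faint galaxies such as `halo_id > 0` would be applied before calling it.

```python
import numpy as np

def single_member_noncentrals(halo_id, is_central):
    """Boolean mask of galaxies that are alone in their halo yet not central."""
    halo_id = np.asarray(halo_id)
    is_central = np.asarray(is_central, dtype=bool)
    # inverse maps every galaxy to its halo's slot in the distinct-ID list,
    # so counts[inverse] is the member count of each galaxy's host halo.
    _, inverse, counts = np.unique(
        halo_id, return_inverse=True, return_counts=True
    )
    members = counts[inverse]
    return (members == 1) & ~is_central

# Toy data, invented for illustration only.
halo_id = np.array([7, 7, 8, 9])
is_central = np.array([True, False, False, True])
stellar_mass = np.array([1e10, 1e9, 5e8, 2e10])
halo_mass = np.array([1e12, 1e12, 3e10, 5e11])

mask = single_member_noncentrals(halo_id, is_central)
print(halo_id[mask], stellar_mass[mask], halo_mass[mask])
```

The masked `stellar_mass` and `halo_mass` arrays can then be histogrammed, and the same mask applied to position columns shows whether the objects cluster near the catalog boundaries.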

evevkovacs commented 5 years ago

It is possible for a halo to have a single galaxy that is not a "central". As you can see from the statistics, it is a rare event. In such cases the central galaxy has been disrupted, possibly by tidal forces. In a catalog the size of cosmoDC2, one would expect to see a small percentage of such anomalies. Thanks for checking.

aphearin commented 5 years ago

Thanks for continuing to pore over the catalog for anomalies, @plaszczy. @evevkovacs is correct that there is nothing about halo occupation statistics that strictly forbids this, and so small numbers of such halos are expected in a large sample. But I also think @yymao is right that it would be useful to check the distribution of stellar and/or halo masses of such galaxies.

plaszczy commented 5 years ago

OK, that's interesting. I cannot easily look at the masses, because my data come from a rather heavy process (conversion to parquet format) and I did not save these variables. @yymao (or anyone) may want to have a quick look (with GCR) at some of these masses, which is why I gave a list of halo_id's.

yymao commented 5 years ago

You can find the distributions of host halo mass, stellar mass, and spatial location of these single-member noncentral galaxies in this notebook:

https://nbviewer.jupyter.org/urls/pastebin.com/raw/1E1nyCSn

Note that the fraction of such galaxies is extremely small. The distribution plots are normalized to highlight the distributions. As expected, most come from low-mass halos.

plaszczy commented 5 years ago

OK, that makes sense. I just wanted to be sure there was nothing algorithmic that could have missed the right is_central assignment (though I don't even know how it is assigned). So if you are sure, then that's fine.