No ancestors found for AHF data

milanq14 commented 1 year ago

I'm working with Amiga Halo Finder (AHF) data and have the following files in my directory: .log, .parameter, .AHF_halos, and .AHF_mtree. Should this be enough for ytree to work correctly? Any idea why it is not finding any ancestors for any of the halos? I am working with the HESTIA simulations, and the error I am getting is:

AttributeError Traceback (most recent call last)
Input In [42], in
----> 1 my_ancestors = list(a.ancestors)
      2 print(my_ancestors)

AttributeError: 'AHFArbor' object has no attribute 'ancestors'

Thanks!

brittonsmith commented 1 year ago

Hi there,

If you were able to load the arbor, then all the necessary files should be there. However, from the code you posted, it looks like you need to first select a tree from the arbor before accessing its ancestors. Something like this:

tree = a[0]
print(list(tree.ancestors))

See here for more information. Please, let me know if this fixes your issue.

milanq14 commented 1 year ago

Hi!

Thanks, yes, I was doing it wrong; the error no longer occurs. However, I've noticed that all my trees are empty. For instance, doing

a = ytree.load("my_route/HESTIA_100Mpc_4096_01_12.127.parameter", hubble_constant=0.677)
fn = a.save_arbor()
a = ytree.load(fn)
my_ancestors = list(a[4].ancestors)
print(my_ancestors)

gives an empty list: []

I have uploaded the files I am using in: https://drive.google.com/drive/folders/1hOWycsBljNFNwNJjmpcXZ_KVaiJBcpWX?usp=share_link

Could this be related to the mtree data structure of the simulations I am using?

Thanks again for your help!

brittonsmith commented 1 year ago

Hi @milanq14, my sincere apologies for dropping the ball on this. I was buried by teaching and forgot about it. Is this still an issue for you? If so, would you mind sharing your data again? I can take a look now.

robmost commented 7 months ago

Hi,

I am sorry to reuse an old open issue but I happen to have the same problem, and I was wondering if it was ever addressed. I followed the same steps, and I'm unable to get the merger tree information processed by ytree.

My mtrees use the old .AHF_mtree_idx format, in which for N snapshots you get N-1 files that link consecutive snapshots. However, they are allowed to skip snapshots, i.e. a halo progenitor not found at the current snapshot is still searched for in all the remaining snapshots and, if found, it is appended at the end of the haloes linked at the current snapshot. Here is how the file would look when linking snapshot 18 with snapshot 17 haloes:

# snap018 snap017
180000090000000002 170000090000000002
180000090000000026 170000090000000020
180000090000000031 170000090000000019
200000090000000076 170000090000000016
220000040000000236 170000040000000028

The IDs follow the formula: snapshot * 1e16 + mpi_rank * 1e10 + (halo_id + 1), so the first two digits of the IDs tell the snapshot at which AHF found them.
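To make the ID layout concrete, here is a minimal sketch of how such an ID could be split back into its parts. The names `decode_ahf_id`, `encode_ahf_id`, and `AHFId` are hypothetical helpers written for this comment, not part of AHF or ytree. It uses integer arithmetic throughout, since 18-digit IDs are larger than 2**53 and would lose precision as floats.

```python
from typing import NamedTuple

# Integer factors from the formula above (hypothetical constant names).
SNAPSHOT_FACTOR = 10**16
RANK_FACTOR = 10**10


class AHFId(NamedTuple):
    snapshot: int
    mpi_rank: int
    halo_id: int


def decode_ahf_id(uid: int) -> AHFId:
    """Split a unique ID into (snapshot, mpi_rank, halo_id)."""
    snapshot, rest = divmod(uid, SNAPSHOT_FACTOR)
    mpi_rank, index = divmod(rest, RANK_FACTOR)
    # The formula stores halo_id + 1, so subtract 1 to recover halo_id.
    return AHFId(snapshot, mpi_rank, index - 1)


def encode_ahf_id(snapshot: int, mpi_rank: int, halo_id: int) -> int:
    """Inverse of decode_ahf_id, following the same formula."""
    return snapshot * SNAPSHOT_FACTOR + mpi_rank * RANK_FACTOR + (halo_id + 1)


print(decode_ahf_id(180000090000000002))
print(decode_ahf_id(220000040000000236))
```

For the first ID in the example file this gives snapshot 18, MPI rank 9, and halo_id 1; the last ID decodes to snapshot 22, MPI rank 4, and halo_id 235.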

I don't know if that is relevant to the problem, but I suspect it would need to be taken into account in I/O.
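As an illustration of what a reader would have to handle, below is a rough sketch that parses the example above and flags the links that skip snapshots. It assumes two whitespace-separated integer columns per row and comment lines starting with `#`; real files may carry extra columns. The function names are hypothetical and this is not how ytree reads AHF data.

```python
SNAPSHOT_FACTOR = 10**16  # the leading digits of each ID encode the snapshot

# The example rows from this comment, used here as sample input.
SAMPLE = """\
# snap018 snap017
180000090000000002 170000090000000002
180000090000000026 170000090000000020
180000090000000031 170000090000000019
200000090000000076 170000090000000016
220000040000000236 170000040000000028
"""


def parse_links(text):
    """Return a list of (first_id, second_id) pairs, skipping comments."""
    links = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        first, second = line.split()[:2]
        links.append((int(first), int(second)))
    return links


def snapshot_of(uid):
    """Snapshot number encoded in the leading digits of the ID."""
    return uid // SNAPSHOT_FACTOR


def skipped_links(links):
    """Links whose two haloes are not in adjacent snapshots."""
    return [(a, b) for a, b in links
            if abs(snapshot_of(a) - snapshot_of(b)) != 1]


links = parse_links(SAMPLE)
for a, b in skipped_links(links):
    print(f"snapshot {snapshot_of(a)} <-> snapshot {snapshot_of(b)}: {a} {b}")
```

On the sample this reports the last two rows, which link snapshot 17 haloes to haloes found in snapshots 20 and 22.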

Cheers

milanq14 commented 7 months ago

Hi,

First of all, my apologies, Britton: I ended up approaching my study in a different manner, without using ytree, and I completely forgot about this. Second of all, no, Robert, I was not able to resolve the issue, and I would be happy to look at the data again. Judging from the mtree format you show, I suspect you are looking at HESTIA simulations?

brittonsmith commented 5 months ago

@robmost, apologies for taking so long to respond. I am now starting to come out of my teaching-related hibernation. I'd be willing to look at this as time permits. Do you have some sample data that I could work with?

milanq14 commented 5 months ago

Hi there:

I have uploaded the data for two successive snapshots here:

https://drive.google.com/drive/folders/1hOWycsBljNFNwNJjmpcXZ_KVaiJBcpWX?usp=drive_link

I hope this helps.

brittonsmith commented 1 month ago

Hi everyone,

I'm not sure this is relevant to anyone anymore, but I've finally had some time to look at this issue using the data provided by @milanq14. The mtree files are in a different format than what ytree currently supports. I have done something that makes things work in PR #168, but it ignores the fact that this format stores the unique IDs. With the changes in the PR, you can load the data, query fields, etc., but you'll notice that the first halo has an ID of 0 and not, for example, 127000000000001. If there is interest here, then I'd be happy to put some work into this.

@robmost, I do not have any sample data that resembles the format you describe. It would take some moderate modifications to make it work, but it's likely something I could do this summer if there is still interest. If this is still relevant to you, could you please provide me a complete, but reasonably small (say < 1 GB) dataset to use for testing?

robmost commented 1 month ago

Hi everyone,

I managed to do the analysis I needed, but unfortunately, I had to give up on using ytree as it could not handle the format of my merger trees. That said, I think it would be beneficial to have support for it since newer versions of AHF and MergerTree adopt this format, which is very convenient when dealing with snapshot skipping and MPI and ensures unique IDs across the whole simulation.

@brittonsmith I have uploaded the AHF 1.0-116 + MergerTree 1.2 output for two snapshots from my simulations here

Note that both AHF and MergerTree have been run with MPI support. Due to how AHF runs, it outputs the results for each process (10 MPI processes per snapshot). However, I postprocessed the AHF_halos files to have one single file instead of one per process, which is easier for me to work with, whereas the particle files are kept separate since MergerTree can handle that. MergerTree, on the other hand, reads the particle files of each MPI process but outputs only one file per snapshot.
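For reference, the kind of postprocessing described above can be sketched roughly as follows. This is not the actual script: it assumes each per-process AHF_halos file starts with a header line beginning with `#` and that all files share the same columns. The function name and the column names in the example are hypothetical.

```python
def merge_halo_tables(tables):
    """Concatenate per-process halo tables given as strings.

    Keeps the first header line (assumed to start with '#') and
    drops the repeated headers from the remaining tables.
    """
    header = None
    rows = []
    for text in tables:
        for line in text.splitlines():
            if not line.strip():
                continue  # ignore blank lines
            if line.startswith("#"):
                if header is None:
                    header = line
                continue
            rows.append(line)
    lines = ([header] if header is not None else []) + rows
    return "\n".join(lines) + "\n"


# Two made-up per-process tables with placeholder columns.
rank0 = "#ID Mhalo\n180000000000000001 1.0e12\n"
rank1 = "#ID Mhalo\n180000010000000001 5.0e11\n"
merged = merge_halo_tables([rank0, rank1])
print(merged, end="")
```

Reading each per-process file into a string and passing the list to this function yields a single table with one header, in the order the files were given.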

Thanks for taking the time to look into this!

brittonsmith commented 1 month ago

@robmost, thanks for the data. I think I should be able to get this format supported by the end of the summer. Would it be possible to send the data with the individual AHF_halos files for each MPI process? I'd like to be able to support the form of the data most likely to be encountered by any given user.

robmost commented 4 weeks ago

Hi @brittonsmith, sorry for the late reply. Here is the data with the individual AHF_halos files corresponding to each MPI process: link

brittonsmith commented 3 weeks ago

@robmost, no worries, thanks for these. I'll try to make this happen!

ytree-project / ytree

No ancestors found for AHF data #157