cherab / core

The core source repository for the Cherab project.
https://www.cherab.info

Default atomic data distributed with Cherab #364

Open vsnever opened 2 years ago

vsnever commented 2 years ago

While discussing where to store the Gaunt factor for the bremsstrahlung emission model in #352, @Mateasek proposed creating a subpackage, cherab.default_data, to store the default atomic data distributed with Cherab.

This was agreed, but one question remained: should third-party atomic data sources, like OpenADAS, inherit from the core AtomicData interface or from the DefaultAtomicData interface?

I think that third-party data sources should inherit from DefaultAtomicData, because DefaultAtomicData may contain data that is not present in the third-party source. For example, DefaultAtomicData has the Gaunt factor needed for bremsstrahlung but lacks atomic rates, while OpenADAS, if inherited from the core AtomicData, will have the rates but not the Gaunt factor. It would therefore be impossible to simulate, for example, spectral line emission on top of the bremsstrahlung background without initialising the line emission models with OpenADAS and the Bremsstrahlung model with DefaultAtomicData. Yet the most convenient way to use atomic data is to connect it to the Plasma object, so that all models use the same data source.

We may run into a case where both DefaultAtomicData and OpenADAS contain data for the same physical quantity. For such a case we could add a parameter override_default to OpenADAS: if True, the OpenADAS data is used; if False, the default data.
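A rough sketch of how this inheritance could look. The classes below are simplified stand-ins and not the actual Cherab interfaces; the method names and the returned labels are assumptions made only to show which source answers each call.

```python
class CoreAtomicData:
    """Stand-in for the core interface: nothing is implemented here."""

    def gaunt_factor(self):
        raise NotImplementedError("gaunt_factor is not available")

    def excitation_rate(self):
        raise NotImplementedError("excitation_rate is not available")

    def overlapping_quantity(self):
        raise NotImplementedError("overlapping_quantity is not available")


class DefaultAtomicData(CoreAtomicData):
    """Stand-in for the default data distributed with the package."""

    def gaunt_factor(self):
        return "gaunt factor from default data"

    def overlapping_quantity(self):
        return "overlapping quantity from default data"


class OpenADASSource(DefaultAtomicData):
    """Hypothetical third-party source that inherits the default data."""

    def __init__(self, override_default=True):
        self.override_default = override_default

    def excitation_rate(self):
        # Only this source provides the rates.
        return "rate from OpenADAS"

    def overlapping_quantity(self):
        # Both sources provide this quantity, so the flag decides.
        if self.override_default:
            return "overlapping quantity from OpenADAS"
        return super().overlapping_quantity()


source = OpenADASSource(override_default=False)
print(source.gaunt_factor())          # inherited from the default data
print(source.excitation_rate())       # provided by the third-party source
print(source.overlapping_quantity())  # default wins because the flag is False
```

With this layout a single object connected to the plasma can serve both the bremsstrahlung and the line emission models.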

@Mateasek, @jacklovell, what do you think?

Mateasek commented 2 years ago

Thanks for opening an issue, @vsnever. I think that creating default_data could help in the future. In my opinion, it is much better than storing any data in the core.

Another option is importing classes/functions from default_data in the "derived" data. Combining both imports and inheritance could give us enough flexibility in the future. The main point I see is to give users a flexible and simple way to build their own Cherab data source, without the need to copy the actual data between repositories. For example, in the future we could also have an ALADDIN data source, and users could combine data from OpenADAS, ALADDIN and default_data to form their own repository.

I would also ask for the opinion of @CnlPepper and @mattngc here, because this is an important decision.

vsnever commented 2 years ago

> Another option is importing classes/functions from default_data in the "derived" data. Combining both imports and inheritance could give us enough flexibility in the future. The main point I see is to give users a flexible and simple way to build their own Cherab data source, without the need to copy the actual data between repositories. For example, in the future we could also have an ALADDIN data source, and users could combine data from OpenADAS, ALADDIN and default_data to form their own repository.

Also, we can connect a list of atomic data sources instead of a single data source to Plasma. Models will iterate over data sources until they find the first one in which the required function is implemented. In this case we can inherit all atomic data sources from the core interface.
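A minimal sketch of such a lookup, assuming that an unimplemented method in the core interface raises NotImplementedError. The class names, the method names and the helper first_available are hypothetical and not part of Cherab.

```python
class CoreAtomicData:
    """Stand-in for the core interface: nothing is implemented here."""

    def gaunt_factor(self):
        raise NotImplementedError("gaunt_factor is not available")

    def excitation_rate(self):
        raise NotImplementedError("excitation_rate is not available")


class DefaultSource(CoreAtomicData):
    def gaunt_factor(self):
        return "gaunt factor from default data"


class RateSource(CoreAtomicData):
    def excitation_rate(self):
        return "rate from OpenADAS"


def first_available(sources, method_name, *args):
    """Return the result from the first source that implements the method."""
    for source in sources:
        try:
            return getattr(source, method_name)(*args)
        except NotImplementedError:
            continue  # this source lacks the data, try the next one
    raise NotImplementedError(f"no source provides {method_name}")


sources = [RateSource(), DefaultSource()]
print(first_available(sources, "gaunt_factor"))     # falls through to DefaultSource
print(first_available(sources, "excitation_rate"))  # answered by RateSource
```

Note that if two sources in the list implement the same method, the earlier one is used silently, so the result depends on the order of the list.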

Mateasek commented 2 years ago

> Also, we can connect a list of atomic data sources instead of a single data source to Plasma. Models will iterate over data sources until they find the first one in which the required function is implemented. In this case we can inherit all atomic data sources from the core interface.

This could lead to undefined behaviour. What would happen if the list contained several data sources with different data for the same quantity? I can imagine that debugging would be a terrible procedure or, even worse, you could end up with wrong results without even realising it. I think that giving the possibility to prepare a custom source can prevent a lot of problems.

vsnever commented 2 years ago

> I think that giving the possibility to prepare a custom source can prevent a lot of problems.

This seems to be the correct way to solve this problem, although it is complex to implement. However, the effort should pay off in the future.

CnlPepper commented 2 years ago

The way to solve this, and what Matt and I were working towards, was to give Cherab its own data repository and format. OpenADAS and any other data source would simply be used to populate the cherab repository. You can see the start of this inside the "openadas" module - the rates are "installed". You just need to expand this concept.

We never got around to doing this because we never needed or interacted with non-OpenADAS data. So in short, the correct approach is:

1) split the current openadas module into "atomic" and "openadas"
2) the "cherab" repository moves to atomic
3) the rate etc. install/conversion routines stay in openadas
4) other rate/data sources simply install/convert data into an internal cherab form from now on
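The idea can be sketched as follows. This is only an illustration under assumed names: the JSON layout, install_rate, load_rate and both converters are hypothetical and do not reproduce the existing openadas module, and the numbers are placeholders rather than real atomic data.

```python
import json
import tempfile
from pathlib import Path


def install_rate(repository, element, charge, data, source):
    """Write one rate into the repository in a single internal format."""
    path = Path(repository) / "rates" / f"{element}_{charge}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"source": source, "data": data}))
    return path


def load_rate(repository, element, charge):
    """Read a rate back; the reader never needs to know where it came from."""
    path = Path(repository) / "rates" / f"{element}_{charge}.json"
    return json.loads(path.read_text())


def convert_openadas_like(raw):
    """Hypothetical converter for a source that stores named arrays."""
    return {"te": raw["electron_temperature"], "rate": raw["coefficient"]}


def convert_other_source(raw):
    """Hypothetical converter for a source that stores (te, rate) pairs."""
    return {"te": [te for te, _ in raw], "rate": [rate for _, rate in raw]}


# A throwaway repository populated from two differently shaped sources.
repo = tempfile.mkdtemp()
install_rate(
    repo, "hydrogen", 0,
    convert_openadas_like({"electron_temperature": [1.0, 10.0], "coefficient": [0.5, 0.25]}),
    source="openadas-like",
)
install_rate(
    repo, "helium", 1,
    convert_other_source([(1.0, 0.125), (10.0, 0.75)]),
    source="other",
)
print(load_rate(repo, "helium", 1))
```

Each source keeps its own conversion routine, while everything that reads the repository sees one format only.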

This approach is the most scalable and, as a nice side effect, it would introduce a new, hopefully cleaner, atomic data representation to the community. The community can then chip away at the garbage (representation/API-wise... looking at you, ADAS) that the current data sources are.

CnlPepper commented 2 years ago

I'd add default data to the cherab repo, much like the wavelength data. Users can override it as they want.

vsnever commented 2 years ago

Thank you very much, @CnlPepper, I think I got it. So, the atomic data repository created in the user folder is no longer associated with a single data source exclusively. The user can populate the repository with data from multiple sources, and since multiple repositories are allowed, switching between them gives the desired flexibility.

The current format for representing atomic data in Cherab has a strong correlation with how data is provided in ADAS, and this also affects the AtomicData interfaces. But for now we can pretend that this representation is universal and improve it in the future if necessary.