Question on how to properly obtain the in[active] orbitals for removal in ActiveSpaceTransformer

buttercutter commented 1 year ago

I am trying to use https://qiskit.org/documentation/nature/stubs/qiskit_nature.second_q.transformers.ActiveSpaceTransformer.html to reduce a quantum chemistry problem for a more feasible computation.

However, the class requires active_orbitals as input, and there is no explicit mention on how we could achieve orbital removal whether we feed in active_orbitals input or not. Besides, how would we come up with the active_orbitals input if we do not know which are the inactive orbitals for removal in the first place ?

The relevant code snippet within ActiveSpaceTransformer is as follows, however I am not sure how it works.

    def _determine_active_space(
        self, grouped_property: GroupedElectronicProperty
    ) -> Tuple[List[int], List[int]]:
        """Determines the active and inactive orbital indices.
        Args:
            grouped_property: the `GroupedElectronicProperty` to be transformed.
        Returns:
            The list of active and inactive orbital indices.
        """
        particle_number = grouped_property.get_property(ParticleNumber)
        if isinstance(self._num_electrons, tuple):
            num_alpha, num_beta = self._num_electrons
        elif isinstance(self._num_electrons, (int, np.integer)):
            num_alpha = num_beta = self._num_electrons // 2

        # compute number of inactive electrons
        nelec_total = particle_number._num_alpha + particle_number._num_beta
        nelec_inactive = nelec_total - num_alpha - num_beta

        self._validate_num_electrons(nelec_inactive)
        self._validate_num_orbitals(nelec_inactive, particle_number)

        # determine active and inactive orbital indices
        if self._active_orbitals is None:
            norbs_inactive = nelec_inactive // 2
            inactive_orbs_idxs = list(range(norbs_inactive))
            active_orbs_idxs = list(
                range(norbs_inactive, norbs_inactive + self._num_molecular_orbitals)
            )
        else:
            active_orbs_idxs = self._active_orbitals
            inactive_orbs_idxs = [
                o
                for o in range(nelec_total // 2)
                if o not in self._active_orbitals and self._mo_occ_total[o] > 0
            ]

        return (active_orbs_idxs, inactive_orbs_idxs)

Relevant issue : please see question 1 of https://github.com/tencent-quantum-lab/tensorcircuit/issues/120#issue-1570971882

mrossinek commented 1 year ago

The ActiveSpaceTransformer does not require you to provide active_orbitals. That is an optional input argument which allows you to hand-pick the active orbitals based on their indices. If you do not provide it, the active orbitals are automatically chosen around the Fermi level such that their occupation matches your selection of active number of electrons and number of orbitals. This is also explained in the documentation here. This tutorial also has a section on the ActiveSpaceTransformer where you find some examples of how to construct the class and use it.

That said, this is a very simple transformation as it simply takes the orbitals as given. There is no tooling in Qiskit Nature for more advanced (or automated) active space selection techniques. Picking your active orbitals often requires both, further optimization (or localization) of your orbitals as well as a lot of chemical intuition on which orbitals to pick. That is a whole area of research so and definitely exceeds of what I can summarize in this comment. That said, here are a few potentially useful resources that aid the user in finding an active set of orbitals from the top of my head (unordered and obviously incomplete):

https://arxiv.org/abs/1904.00097 (https://scine.ethz.ch/download/autocas)
https://arxiv.org/abs/1701.07862 (implemented in PySCF: pyscf.mcscf.avas)

And for larger systems where you want to confine your active orbitals to a subsystem in physical space (rather than orbitals in "energy space") you should look into localization and/or projection techniques. Depending on the method this might already produce a reduced problem size or it will simplify and the hand-picking of your active orbital set.

And as a final remark, I highly suggest that you switch to using the new and non-deprecated version of the ActiveSpaceTransformer which you can find here: https://github.com/Qiskit/qiskit-nature/blob/main/qiskit_nature/second_q/transformers/active_space_transformer.py (note the import-path is now qiskit_nature.second_q.transformers.ActiveSpaceTransformer).

buttercutter commented 1 year ago

I am checking the actual paper reference which seems to combine both density and wavefunction approaches. Could anyone explain more on this ?
When I checked further, it seems that there is no way for the user to explicitly provide the total number of electrons for ParameterNumber object instance.

Note that num_electrons class input is only for number of active electrons

Besides, I do not quite understand the intuition behind the following code snippet.

        if self._active_orbitals is None:
            norbs_inactive = nelec_inactive // 2
            inactive_orbs_idxs = list(range(norbs_inactive))
            active_orbs_idxs = list(
                range(norbs_inactive, norbs_inactive + self._num_molecular_orbitals)
            )

chatGPT gave the following reply which does not give the intuition, but rather only just reading through the code.

If the active_orbitals input is not provided by the user, the determine_active_space method computes the number of inactive electrons as the difference between the total number of electrons in the system and the number of electrons in the active space.

Then it computes the number of inactive orbitals (i.e., orbitals not included in the active space) as the number of inactive electrons divided by 2. It assumes that the active orbitals are the num_molecular_orbitals orbitals following the inactive orbitals.

So, the method determines the active orbital indices as num_molecular_orbitals orbitals starting from the first inactive orbital index, and the inactive orbital indices as the first nelec_inactive // 2 orbital indices. These indices are then returned as a tuple (active_orbs_idxs, inactive_orbs_idxs).

mrossinek commented 1 year ago

I am checking the actual paper reference which seems to combine both density and wavefunction approaches. Could anyone explain more on this ?

The ActiveSpaceTransformer provided in Qiskit Nature implements the embedding into HF orbitals. The extension towards embedding into DFT is discussed over in #26. There are no further updates on the DFT Embedding which I can provide at this time.

When I checked further, it seems that there is no way for the user to explicitly provide the total number of electrons for ParameterNumber object instance.

When you switch to using the refactored code of Qiskit Nature 0.5 (rather than 0.4) you can set the total number of electrons in your system via the ElectronicStructureProblem.num_particles attribute. See the migration guide for more details on how to update your code.

Besides, I do not quite understand the intuition behind the following code snippet.

Let me add some comments inline:

        # when no active orbital indices were hand-picked, we automatically pick them
        if self._active_orbitals is None:
            # we can infer the number of inactive orbitals from the number of inactive electrons
            # more specifically, all inactive orbitals are doubly occupied, hence we divide by 2
            norbs_inactive = nelec_inactive // 2
            # we then simply say that the lowest orbital indices are the inactive ones
            # e.g. [0, 1, 2] for nelec_inactive = 6
            inactive_orbs_idxs = list(range(norbs_inactive))
            # the active orbitals are now simply starting above the inactive orbitals
            # continuing the example of nelec_inactive = 6 (norbs_inactive = 3) we get:
            # [3, 4, 5] assuming self._num_molecular_orbitals = 3
            active_orbs_idxs = list(
                range(norbs_inactive, norbs_inactive + self._num_molecular_orbitals)
            )

buttercutter commented 1 year ago

class ActiveSpaceTransformer(num_electrons, num_spatial_orbitals, active_orbitals=None)

Given that num_electrons indicates the number of active electrons, how do we decide its value if we do not know about active_orbitals in the first place ? I feel that this is like a chicken-and-egg issue, please correct me if wrong.

I will also check the two arxiv papers you linked earlier

mrossinek commented 1 year ago

Normally you first pick the size of your problem to which you want to restrict your active space. And given a particular system based on chemical intuition you should be able to come up with a desired number of active electrons and a desired number of active orbitals to place them in.

But that is exactly where the difficulties of embedding approaches like this one arise :wink:

buttercutter commented 1 year ago

@mrossinek Thanks for your suggestion, I have the LUMO and HOMO molecular orbitals now for Fe2+ metallic ion for carbon capture MOFs.

molecular_orbitals_Fe2_CO2

buttercutter commented 1 year ago

regarding "active_orbitals" from the documentation it should only be used to enforce an active space that is not chosen purely around the Fermi level. most of the interaction are happening around the Fermi level and changing it for a compound such as (Fe2+)CO2 can give not accurate results as if it was centered around the Fermi level as the class will do thus I suggest to let active_orbitals=None by this the code will specify the HOMOs and LUMOs that are really involved in the calculation specifying active_orbitals can be fine in the case of single species molecules like N2, yet for a compound that can be misleading

Someone told me the above, could you advise further ?

qiskit-community / qiskit-nature

Question on how to properly obtain the in[active] orbitals for removal in ActiveSpaceTransformer #1087