robinzyb / cp2kdata

cp2k postprocessing tools
https://robinzyb.github.io/cp2kdata/
GNU Lesser General Public License v3.0

Does not work for systems with over 100 atoms #52

Closed hellozhaoming closed 1 month ago

hellozhaoming commented 3 months ago

If we parse the atomic kinds of a system with over 100 atoms, the length of atomic_kind will be 99 while the length of atom_kind_list will be larger than 99. In output.py:

def get_chemical_symbols_fake(self):
    if (self.atom_kind_list is not None) and (self.atomic_kind is not None):
        return self.atomic_kind[self.atom_kind_list-1]

this raises the error 'index 99 is out of bounds for axis 0 with size 99'.

This is because the CP2K output file is unable to display a kind index above 99, for example:

  1. Atomic kind: O99 Number of atoms: 1
  2. Atomic kind: O99 Number of atoms: 1
  **. Atomic kind: O100 Number of atoms: 1
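The failure can be reproduced with a short sketch. Everything in it is illustrative: the sample text is generated by a hypothetical helper (fake_kind_line) rather than taken from real CP2K output, the two-character index field is an assumption based on the lines above, and atom_kind_list is a hand-written stand-in for the per-atom kind indices.

```python
import re

import numpy as np

# Original pattern: the kind index must be numeric.
OLD_RE = re.compile(r"\s{2}\d+\.\sAtomic\skind:\s+(?P<atomic_kind>\S+)")


def fake_kind_line(index: int) -> str:
    # Assumption: a two-character index field that is shown as "**" above 99.
    field = f"{index:2d}" if index <= 99 else "**"
    return f"  {field}. Atomic kind: O{index}    Number of atoms: 1"


# 100 kinds with one atom each (made-up data, not real CP2K output).
sample = "\n".join(fake_kind_line(i) for i in range(1, 101))

atomic_kind = np.array(
    [m["atomic_kind"] for m in OLD_RE.finditer(sample)], dtype=str
)
atom_kind_list = np.arange(1, 101)  # stand-in for the per-atom kind indices

print(len(atomic_kind))  # 99: the "**." line is skipped by the pattern

try:
    atomic_kind[atom_kind_list - 1]
except IndexError as exc:
    error_message = str(exc)
    print(error_message)
```

Only 99 labels are collected, so the lookup for the hundredth kind falls outside the array.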

CP2K version: 7.1
cp2kdata version: 0.6.9

robinzyb commented 3 months ago

Could you provide the input and output files? Usually, we don't set the number of atomic kinds above 100.

robinzyb commented 3 months ago

Also, if there is no way to show all atomic kinds, all I can do is raise an error for this case.
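A minimal sketch of what such a check might look like, assuming an overflowed index appears as asterisks at the start of a kind line. The function name check_kind_index_overflow and the error text are hypothetical and not part of cp2kdata.

```python
import re

# Matches a kind line whose index field was printed as asterisks.
OVERFLOW_RE = re.compile(r"^\s*\*+\.\sAtomic\skind:", re.MULTILINE)


def check_kind_index_overflow(output_text: str) -> None:
    """Raise ValueError if any atomic kind index is shown as asterisks."""
    if OVERFLOW_RE.search(output_text):
        raise ValueError(
            "Some atomic kind indices are printed as asterisks; "
            "the kind indices in this output cannot all be read."
        )
```

Calling it on the output text before any kind parsing would turn the later IndexError into an explicit message.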

hellozhaoming commented 3 months ago

The files are too large to upload as attachments, so I shared them via Baidu Netdisk. Link: https://pan.baidu.com/s/1fF-ODhDMyKxC8O40Wu97Xw Extraction code: no74

hellozhaoming commented 3 months ago

I fixed the problem with a small modification of the functions in atomic_kind.py:

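import re

import numpy as np

ATOMIC_KINDS_RE = re.compile(
    # original pattern, which only accepts a numeric kind index:
    #r"""
    #\s{2}\d+\.\sAtomic\skind:\s+(?P<atomic_kind>\S+)
    #""",
    # modified pattern: the index may also be printed as asterisks
    r"""
    \s{2}(\d+|\*+)\.\sAtomic\skind:\s+(?P<atomic_kind>\S+)
    """,
    re.VERBOSE
)


def parse_atomic_kinds(output_file):
    num_atomic_kinds_list = parse_num_atomic_kinds(output_file)
    atomic_kinds = []
    for match in ATOMIC_KINDS_RE.finditer(output_file):
        # strip trailing digits so only the element symbol remains, e.g. "O99" -> "O"
        atomic_kinds.append(match["atomic_kind"].rstrip('0123456789'))
        #atomic_kinds.append(match["atomic_kind"])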
    if atomic_kinds:
        # only return the last atomic kinds
        return np.array(atomic_kinds[-num_atomic_kinds_list[-1]:], dtype=str)
    else:
        return None

robinzyb commented 3 months ago

Thanks. I think I still need to preserve the current behavior of returning the full atomic kind labels. I can add an extra option in Cp2kOutput for when you just need element symbols.
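One way such an option could look is sketched below. The helper parse_kind_labels and its symbols_only flag are hypothetical names for illustration, not the cp2kdata API. The pattern accepts either digits or asterisks as the kind index, following the modification proposed above.

```python
import re

import numpy as np

# Accept a numeric index or an overflowed one printed as asterisks.
KIND_RE = re.compile(r"\s{2}(?:\d+|\*+)\.\sAtomic\skind:\s+(?P<atomic_kind>\S+)")


def parse_kind_labels(output_text: str, symbols_only: bool = False):
    """Hypothetical helper: return kind labels, optionally reduced to symbols."""
    labels = [m["atomic_kind"] for m in KIND_RE.finditer(output_text)]
    if symbols_only:
        # Drop trailing digits, e.g. "O100" -> "O".
        labels = [label.rstrip("0123456789") for label in labels]
    return np.array(labels, dtype=str) if labels else None


text = (
    "  99. Atomic kind: O99    Number of atoms: 1\n"
    "  **. Atomic kind: O100   Number of atoms: 1\n"
)
full_labels = parse_kind_labels(text)
symbols = parse_kind_labels(text, symbols_only=True)
```

With the default, the full labels are kept, so existing behavior is unchanged. The flag only strips trailing digits, which assumes that kind labels are an element symbol followed by a number.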

robinzyb commented 1 month ago

Hi, I have decided not to fix this problem. It is caused by CP2K reading a CIF file as input. You can try starting the MD run from an XYZ-format file so that only 4 types of element are counted.