MaterialsGalaxy / larch-tools

Galaxy tool wrappers for Larch analysis tools for X-ray spectroscopy
MIT License

Paper 5 #30

Open patrick-austin opened 8 months ago

patrick-austin commented 8 months ago

https://github.com/UK-Catalysis-Hub/XAS-Workflow-Demo/blob/main/psdi_phase_1/larch/Paper%2005%20Reproduce%20XAS.ipynb

alex-belozerov commented 8 months ago

Summary of reproduction. There is overall agreement between the results presented in paper 5 and those obtained using Galaxy. However, there are some issues:

Please see below for a detailed comparison of all figures.

Fig. 2A. Galaxy reproduces the figure from Abraham's notebook perfectly. However, the curves from the original paper exhibit slightly different behavior at around 6540 and 6580 eV (see the curve for LaMnO3).

Original paper:

Image

Abraham's notebook:

Image

Galaxy:

Image

alex-belozerov commented 8 months ago

Fig. 2B The graph is well reproduced.

Original paper:

Image

Abraham's notebook:

Image

Galaxy:

Image

alex-belozerov commented 8 months ago

Fig. 3 The same features can be traced in all figures. However, the Galaxy graph appears less smeared, likely due to the rmin and rmax Fourier transform parameters, which cannot be set in the Larch Athena tool. These parameters can be set in Larch Artemis, but in that case there is no option to plot all curves on the same figure.

Original paper:

Image

Abraham's notebook:

Image

Galaxy: Galaxy177- 0_chir_mag

alex-belozerov commented 8 months ago

Fig. 5 The same features can be traced in all figures. However, the Galaxy graph appears less smeared, likely due to the rmin and rmax Fourier transform parameters, which cannot be set in the Larch Athena tool. These parameters can be set in Larch Artemis, but in that case there is no option to plot all curves on the same figure.

Original paper:

Image

Abraham's notebook:

Image

Galaxy: Galaxy179- 0_chir_mag

alex-belozerov commented 8 months ago

Fig. 4A The same features can be traced in all figures. However, there are some minor differences, which can be caused by a different number of parameters for the fit:

Original paper:

Image

Abraham's notebook:

Image

Galaxy: Galaxy182- ChiKR_plot_for_Fig _4A

alex-belozerov commented 8 months ago

Fig. 4B The same features can be traced in all figures. However, there are some minor differences, which can be caused by a different number of parameters for the fit.

Original paper:

Image

Abraham's notebook:

Image

Galaxy:

Image

alex-belozerov commented 8 months ago

Fig. 4C The same features can be traced in all figures. However, there are some minor differences, which can be caused by a different number of parameters for the fit.

It seems that the wrong paths were selected in Abraham's notebook for this figure. In particular, the file mn_k_4_edge_sp.csv contains paths for LaMnO3, which should be used only for Fig. 4D.

Original paper:

Image

Abraham's notebook:

Image

Galaxy: Galaxy188- ChiKR_plot_for_Fig _4C

alex-belozerov commented 8 months ago

When specifying Mn as the absorbing atom in Larch FEFF for LaMnO3, I got an error. I also tried to specify it by index within the structure, but that didn't help.

The crystal structure file can be found here (GitHub doesn't allow attaching it directly): https://www.ccdc.cam.ac.uk/structures/Search?Ccdcid=1667441&DatabaseToSearch=ICSD

I have created a separate ticket for this issue.

patrick-austin commented 8 months ago

As with the other papers, how close we need the graphs to match in order to say the work is "reproduced" is unclear to me (fig. 2, 3, 5).

The same features can be traced in all figures. However, the Galaxy graph appears less smeared, likely due to the custom parameters for the Fourier transform used by both Abraham and in the paper.

Which parameters are these? Are you not able to provide them with the current tool options: https://github.com/MaterialsGalaxy/larch-tools/blob/d38888e9a53f605727f69c4270d0be9e63efb2fc/larch_artemis/larch_artemis.xml#L83-L104

The issue with La (I have run MnO2 OK before, and trying with pure Ln fails with the same error) is more troubling, as this seems to be outside our direct influence (i.e. not in any Python code but in FEFF's Fortran):

 : muffin tin radii and interstitial parameters
 : At line 90 of file istprm.f
 : Fortran runtime error: End of file
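For illustration only, a wrapper could scan FEFF's output for this kind of failure and report where it came from. The sketch below is not part of larch-tools; the function name `find_fortran_error` is hypothetical, and it assumes the error is printed in the same two-line form as in the log above.

```python
import re

# Matches the "At line N of file F" banner that precedes the
# "Fortran runtime error: ..." message in the log above.
_LOCATION = re.compile(r"At line (\d+) of file (\S+)")
_MESSAGE = re.compile(r"Fortran runtime error:\s*(.+)")


def find_fortran_error(log_text):
    """Return (source_file, line_number, message), or None if no error is found."""
    location = None
    for line in log_text.splitlines():
        loc = _LOCATION.search(line)
        if loc:
            # Remember the location; the message follows on a later line.
            location = (loc.group(2), int(loc.group(1)))
            continue
        msg = _MESSAGE.search(line)
        if msg:
            source_file, line_number = location or (None, None)
            return source_file, line_number, msg.group(1).strip()
    return None


feff_log = """ : muffin tin radii and interstitial parameters
 : At line 90 of file istprm.f
 : Fortran runtime error: End of file
"""
print(find_fortran_error(feff_log))
```

A check like this only makes the failure easier to spot in a job's output; it says nothing about why FEFF hit the end of the file.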

Based on your history in Galaxy, you manually uploaded the FEFF outputs - did you get these by running the standalone Demeter application(s)? Abraham's ipynb seems to be using the same version of FEFF as Galaxy is (6L.02) but compiled for Windows. Otherwise the only other thing I can think of that might help is uploading the .inp file that FEFF uses rather than the CIF file. I suspect that in both cases the issue lies in FEFF, but we could rule out the CIF -> .inp conversion that occurs in the Galaxy tool.

I also remember discussions about different versions of FEFF not working for heavier atoms - which sounds promising except for the fact that Abraham's notebook uses FEFF6 successfully, so differences between the Linux and Windows versions of FEFF seem to me like the only thing that could be causing this...

alex-belozerov commented 8 months ago

Thanks, Patrick, for your reply.

Which parameters are these? Are you not able to provide them with the current tool options: [larch-tools/larch_artemis/larch_artemis.xml]

You are right; it is possible to provide Fourier Transform parameters within Larch Artemis, and I overlooked that field. Actually, I used Larch Plot to calculate and plot |chi(R)| for several datasets on one figure in order to reproduce Figs. 3 and 5. I think it would also be great to be able to provide FT parameters there.

Based on your history in Galaxy, you manually uploaded the FEFF outputs - did you get these by running the standalone Demeter application(s)?

No, I ran Abraham's ipynb using FEFF 6L.02 under Linux and it worked fine for the same crystal structure (https://www.ccdc.cam.ac.uk/structures/Search?Ccdcid=1667441&DatabaseToSearch=ICSD).

Otherwise the only other thing I can think of that might help is uploading the .inp file that FEFF uses rather than the CIF file. I suspect that in both cases the issue lies in FEFF, but we could rule out the CIF -> .inp conversion that occurs in the Galaxy tool.

The .inp file works well in Larch FEFF. However, the file already contains info about the absorbing site and radius.
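As a rough way to see which absorbing site and radius an existing .inp file already encodes, the ATOMS card can be read directly. This is a minimal sketch, assuming the conventional FEFF layout in which each ATOMS row is `x y z ipot tag`, the absorber carries potential index 0, and `*` starts a comment. The function name `summarize_feff_atoms` and the example coordinates are hypothetical, not taken from the LaMnO3 structure.

```python
import math


def summarize_feff_atoms(inp_text):
    """Return (absorber_tag, cluster_radius) from the ATOMS card of a feff.inp."""
    in_atoms = False
    absorber = None
    atoms = []
    for raw in inp_text.splitlines():
        line = raw.split("*", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()
        if fields[0].upper() == "ATOMS":
            in_atoms = True
            continue
        if not in_atoms:
            continue
        try:
            x, y, z = (float(v) for v in fields[:3])
            ipot = int(fields[3])
        except (ValueError, IndexError):
            break  # a non-numeric row (e.g. END) closes the ATOMS list
        tag = fields[4] if len(fields) > 4 else None
        atoms.append((x, y, z))
        if ipot == 0:
            absorber = (tag, (x, y, z))
    if absorber is None:
        raise ValueError("no atom with ipot 0 found in the ATOMS card")
    tag, origin = absorber
    # Cluster radius: distance from the absorber to the farthest listed atom.
    radius = max(math.dist(origin, xyz) for xyz in atoms)
    return tag, radius


# Hypothetical three-atom cluster, for illustration only.
example_inp = """TITLE hypothetical cluster, for illustration only
POTENTIALS
 * ipot  Z  element
   0    25  Mn
   1     8  O
ATOMS
 * x        y        z      ipot  tag
   0.0000   0.0000   0.0000  0    Mn
   2.0000   0.0000   0.0000  1    O
   0.0000   0.0000  -3.0000  1    O
END
"""
print(summarize_feff_atoms(example_inp))
```

Comparing this summary for a working .inp file against one generated from the CIF might help narrow down where the conversion differs.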

patrick-austin commented 8 months ago

You are right; it is possible to provide Fourier Transform parameters within Larch Artemis, and I overlooked that field. Actually, I used Larch Plot to calculate and plot |chi(R)| for several datasets on one figure in order to reproduce Figs. 3 and 5. I think it would also be great to be able to provide FT parameters there.

In that case, the current approach would be to set XFTF to True in Larch Athena to expose the following settings: https://github.com/MaterialsGalaxy/larch-tools/blob/d38888e9a53f605727f69c4270d0be9e63efb2fc/larch_athena/larch_athena.xml#L163-L176

This should then save these parameters to the Athena project file, so that when reloading it with (e.g.) Larch Plot, these parameters will be used (instead of defaults). When originally adding the Larch Plot tool, it was to allow comparison plots, with the intention that all normalisation and pre-processing should be in Larch Athena (to give each tool a clearer purpose). Having said that, if this way of working is not intuitive, we could revisit the role of each tool and combine the two perhaps. Equally, if any XFTF settings needed from Athena are missing, we can add them.

No, I ran Abraham's ipynb using FEFF 6L.02 under Linux and it worked fine for the same crystal structure (https://www.ccdc.cam.ac.uk/structures/Search?Ccdcid=1667441&DatabaseToSearch=ICSD).

Otherwise the only other thing I can think of that might help is uploading the .inp file that FEFF uses rather than the CIF file. I suspect that in both cases the issue lies in FEFF, but we could rule out the CIF -> .inp conversion that occurs in the Galaxy tool.

The .inp file works well in Larch FEFF. However, the file already contains info about the absorbing site and radius.

Ah interesting, in which case the problem is presumably with pymatgen, the Python library we use to convert from CIF to FEFF .inp format. Hopefully this is a more tractable problem, as we can alter the Python code used to call that library, and/or pin the version if needed (on that latter point, could you list the package versions in the environment you run the notebook in?)

alex-belozerov commented 8 months ago

You are right; it is possible to provide Fourier Transform parameters within Larch Artemis, and I overlooked that field. Actually, I used Larch Plot to calculate and plot |chi(R)| for several datasets on one figure in order to reproduce Figs. 3 and 5. I think it would also be great to be able to provide FT parameters there.

In that case, the current approach would be to set XFTF to True in Larch Athena to expose the following settings:

https://github.com/MaterialsGalaxy/larch-tools/blob/d38888e9a53f605727f69c4270d0be9e63efb2fc/larch_athena/larch_athena.xml#L163-L176

This should then save these parameters to the Athena project file, so that when reloading it with (e.g.) Larch Plot, these parameters will be used (instead of defaults). When originally adding the Larch Plot tool, it was to allow comparison plots, with the intention that all normalisation and pre-processing should be in Larch Athena (to give each tool a clearer purpose).

Patrick, thanks for the clarification. I will rerun my calculation using the FT parameters from Abraham's notebook. One more question: why are the rmin and rmax parameters available in Larch Artemis but not in Larch Athena?

Having said that, if this way of working is not intuitive, we could revisit the role of each tool and combine the two perhaps. Equally, if any XFTF settings needed from Athena are missing, we can add them.

For me, it indeed wasn't intuitive, but I had no previous experience with XAS. It might be straightforward for more experienced users of Athena and Artemis.

Ah interesting, in which case the problem is presumably with pymatgen, the Python library we use to convert from CIF to FEFF .inp format. Hopefully this is a more tractable problem, as we can alter the Python code used to call that library, and/or pin the version if needed (on that latter point, could you list the package versions in the environment you run the notebook in?)

I have pymatgen 2023.12.18 installed. The complete list of packages is shown below.

aiobotocore 2.7.0 aiohttp 3.9.0 aiohttp-retry 2.8.3 aioitertools 0.11.0 aiosignal 1.3.1 amqp 5.2.0 annotated-types 0.6.0 antlr4-python3-runtime 4.9.3 anyio 4.0.0 appdirs 1.4.4 apt-clone 0.2.1 apturl 0.5.2 argon2-cffi 23.1.0 argon2-cffi-bindings 21.2.0 arrow 1.3.0 asteval 0.9.31 asttokens 2.4.0 async-lru 2.0.4 async-timeout 4.0.3 asyncssh 2.14.1 atpublic 4.0 attrs 23.1.0 awscli 1.22.34 Babel 2.13.0 backcall 0.2.0 bcrypt 4.1.2 beautifulsoup4 4.10.0 billiard 4.2.0 bleach 6.1.0 blinker 1.7.0 boto3 1.28.64 botocore 1.31.64 Brlapi 0.8.3 cachetools 5.3.2 celery 5.3.5 certifi 2020.6.20 cffi 1.16.0 chardet 4.0.0 charset-normalizer 3.3.0 click 8.1.7 click-didyoumean 0.3.0 click-plugins 1.1.1 click-repl 0.3.0 colorama 0.4.4 comm 0.1.4 command-not-found 0.3 configobj 5.0.6 contourpy 1.2.0 cryptography 41.0.5 cupshelpers 1.0 cycler 0.12.1 dbus-python 1.2.18 debugpy 1.8.0 decorator 5.1.1 defer 1.0.6 defusedxml 0.7.1 dictdiffer 0.9.0 dill 0.3.7 diskcache 5.6.3 distro 1.7.0 dnspython 2.4.2 docutils 0.17.1 dpath 2.1.6 dulwich 0.21.6 dvc 3.30.1 dvc-data 2.22.0 dvc-gdrive 2.20.0 dvc-http 2.30.2 dvc-objects 1.2.0 dvc-render 0.6.0 dvc-s3 2.23.0 dvc-studio-client 0.15.0 dvc-task 0.3.0 emmet-core 0.75.0 entrypoints 0.4 exceptiongroup 1.1.3 executing 2.0.0 eyeD3 0.8.10 fabio 2023.10.0 fastapi 0.108.0 fastjsonschema 2.18.1 filelock 3.6.0 Flask 3.0.0 flatten-dict 0.4.2 flufl.lock 7.1.1 fonttools 4.47.0 fqdn 1.5.1 frozenlist 1.4.0 fsspec 2023.10.0 funcy 2.0 future 0.18.3 gitdb 4.0.11 GitPython 3.1.40 google-api-core 2.14.0 google-api-python-client 2.108.0 google-auth 2.23.4 google-auth-httplib2 0.1.1 googleapis-common-protos 1.61.0 gpg 1.16.0 grandalf 0.8 greenlet 3.0.3 grpcio 1.30.2 gto 1.5.0 h11 0.14.0 h5py 3.10.0 hdf5plugin 4.3.0 httplib2 0.20.2 hydra-core 1.3.2 idna 3.3 ifaddr 0.1.7 imageio 2.33.1 IMDbPY 2021.4.18 importlib-metadata 4.6.4 ipykernel 6.25.2 ipysheet 0.7.0 ipython 8.16.1 ipywidgets 8.1.1 isoduration 20.11.0 iterative-telemetry 0.0.8 itsdangerous 2.1.2 jedi 0.19.1 jeepney 
0.7.1 Jinja2 3.1.2 jmespath 1.0.1 joblib 1.3.2 json5 0.9.14 jsonpointer 2.4 jsonschema 4.19.1 jsonschema-specifications 2023.7.1 jupyter_client 8.4.0 jupyter_core 5.4.0 jupyter-events 0.8.0 jupyter-lsp 2.2.0 jupyter_server 2.8.0 jupyter_server_terminals 0.4.4 jupyterlab 4.0.7 jupyterlab-pygments 0.2.2 jupyterlab_server 2.25.0 jupyterlab-widgets 3.0.9 keyring 23.5.0 kiwisolver 1.4.5 kombu 5.3.4 larch 4.0 latexcodec 2.0.1 launchpadlib 1.10.16 lazr.restfulclient 0.14.4 lazr.uri 1.0.6 lazy_loader 0.3 lmfit 1.2.2 louis 3.20.0 lxml 4.9.4 macaroonbakery 1.3.1 maggma 0.60.0 Mako 1.1.3 markdown-it-py 3.0.0 MarkupSafe 2.1.3 matplotlib 3.8.2 matplotlib-inline 0.1.6 mdurl 0.1.2 mistune 3.0.2 mongogrant 0.3.3 mongomock 4.1.2 monty 2023.11.3 more-itertools 8.10.0 mp-api 0.39.4 mpmath 1.3.0 msgpack 1.0.7 multidict 6.0.4 nbclient 0.8.0 nbconvert 7.9.2 nbformat 5.9.2 nemo-emblems 5.8.0 nest-asyncio 1.5.8 netaddr 0.8.0 netifaces 0.11.0 networkx 3.2.1 notebook 7.0.6 notebook_shim 0.2.3 numdifftools 0.9.41 numexpr 2.8.8 numpy 1.26.1 oauth2client 4.1.3 oauthlib 3.2.0 omegaconf 2.3.0 onboard 1.4.1 orjson 3.9.10 overrides 7.4.0 packaging 21.3 palettable 3.3.3 PAM 0.4.2 pandas 2.1.4 pandocfilters 1.5.0 paramiko 3.4.0 parso 0.8.3 pathspec 0.11.2 PeakUtils 1.3.4 pexpect 4.8.0 pickleshare 0.7.5 Pillow 9.0.1 pip 22.0.2 platformdirs 3.11.0 plotly 5.18.0 ply 3.11 prometheus-client 0.17.1 prompt-toolkit 3.0.39 protobuf 4.25.1 psutil 5.9.0 ptyprocess 0.7.0 pure-eval 0.2.2 pyasn1 0.4.8 pyasn1-modules 0.3.0 pybtex 0.24.0 pycairo 1.20.1 PyCifRW 4.4.6 pycparser 2.21 pycups 2.0.1 pycurl 7.44.1 pydantic 2.5.1 pydantic_core 2.14.3 pydantic-settings 2.1.0 pydash 7.0.6 pydot 1.4.2 PyDrive2 1.17.0 pyelftools 0.27 pyfai 2023.9.0 pygit2 1.13.2 Pygments 2.16.1 PyGObject 3.42.1 pygtrie 2.5.0 PyICU 2.8.1 pyinotify 0.9.6 PyJWT 2.3.0 pymacaroons 0.13.0 pymatgen 2023.12.18 pymongo 4.6.1 PyNaCl 1.5.0 pyOpenSSL 23.3.0 pyparsing 2.4.7 pyparted 3.11.7 PyQt5 5.15.6 PyQt5-sip 12.9.1 pyRFC3339 1.1 pyshortcuts 1.9.0 
PySocks 1.7.1 python-apt 2.4.0+ubuntu2 python-dateutil 2.8.2 python-debian 0.1.43+ubuntu1.1 python-dotenv 1.0.0 python-gnupg 0.4.8 python-json-logger 2.0.7 python-magic 0.4.24 python-xlib 0.29 pytz 2022.1 pyxdg 0.27 PyYAML 5.4.1 pyzmq 25.1.1 referencing 0.30.2 reportlab 3.6.8 requests 2.31.0 requests-file 1.5.1 rfc3339-validator 0.1.4 rfc3986-validator 0.1.1 rich 13.7.0 roman 3.3 rpds-py 0.10.6 rsa 4.8 ruamel.yaml 0.17.40 ruamel.yaml.clib 0.2.8 s3fs 2023.10.0 s3transfer 0.7.0 scikit-image 0.22.0 scikit-learn 1.3.2 scipy 1.11.3 scmrepo 1.4.1 SecretStorage 3.3.1 semver 3.0.2 Send2Trash 1.8.2 sentinels 1.0.0 setproctitle 1.2.2 setuptools 59.6.0 shortuuid 1.0.11 shtab 1.6.4 silx 1.1.2 six 1.16.0 smmap 5.0.1 sniffio 1.3.0 soupsieve 2.3.1 spglib 2.2.0 SQLAlchemy 2.0.24 SQLAlchemy-Utils 0.41.1 sqltrie 0.8.0 sshtunnel 0.4.0 stack-data 0.6.3 starlette 0.32.0.post1 sympy 1.12 systemd-python 234 tabulate 0.9.0 tenacity 8.2.3 termcolor 2.4.0 terminado 0.17.1 threadpoolctl 3.2.0 tifffile 2023.12.9 tinycss2 1.1.1 tldextract 3.1.2 toml 0.10.2 tomli 2.0.1 tomlkit 0.12.3 torbrowser-launcher 0.3.3 tornado 6.3.3 tqdm 4.66.1 traitlets 5.11.2 typer 0.9.0 types-python-dateutil 2.8.19.14 typing_extensions 4.8.0 tzdata 2023.3 ubuntu-drivers-common 0.0.0 ufw 0.36.1 uncertainties 3.1.7 Unidecode 1.3.3 uri-template 1.3.0 uritemplate 4.1.1 urllib3 1.26.5 uvicorn 0.25.0 vine 5.1.0 voluptuous 0.14.1 wadllib 1.3.6 wcwidth 0.2.8 webcolors 1.13 webencodings 0.5.1 websocket-client 1.6.4 Werkzeug 3.0.1 wheel 0.37.1 widgetsnbextension 4.0.9 wrapt 1.16.0 xdg 5 xkit 0.0.0 xlrd 1.2.0 xraydb 4.5.4 xraylarch 0.9.74 yarl 1.9.2 youtube-dl 2021.12.17 zc.lockfile 3.0.post1 zipp 1.0.0
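A flat listing like this is awkward to search by eye. As a minimal sketch, assuming every package name is followed by exactly one version token, it can be turned into a lookup table; the helper name `parse_flat_listing` is hypothetical, and only a short excerpt of the listing is used here.

```python
def parse_flat_listing(text):
    """Turn 'name version name version ...' into a {name: version} dict."""
    tokens = text.split()
    if len(tokens) % 2:
        raise ValueError("expected alternating name/version tokens")
    return dict(zip(tokens[0::2], tokens[1::2]))


# Short excerpt of the listing above.
excerpt = "jeepney 0.7.1 larch 4.0 pymatgen 2023.12.18 xraydb 4.5.4 xraylarch 0.9.74"
versions = parse_flat_listing(excerpt)
print(versions["pymatgen"], versions["xraylarch"])
```

Splitting on whitespace also joins entries that were broken across lines in the paste, such as `jeepney` and its version.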

patrick-austin commented 8 months ago

One more question: why are the rmin and rmax parameters available in Larch Artemis but not in Larch Athena?

Performing the FT in the Athena tool calls the xftf function from Larch, which is explicitly a forward transform from k-space to r-space. In Artemis, the feffit_transform function allows you to choose whether to perform the fit in k-space, r-space or one of the other options. The extra options it has are there to allow back transforms (in the Larch notation, I believe it goes k -> r -> q, so choosing to do the fit in q would utilise both the k and r settings, but this should be described in more detail in the documentation somewhere). At least that's my understanding of how the two differ. It's also worth noting that the more advanced settings may not have been added (yet) and will use the defaults from the documentation, but this can be expanded as needed.
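To illustrate why forward-transform settings can change how smeared |chi(R)| looks, here is a dependency-free sketch of a windowed k -> r transform. It is not Larch's xftf, which uses an FFT and its own window options. The Hann-shaped window, the function names and the synthetic single-shell signal are all assumptions made for the example, and whether this effect accounts for the differences in Figs. 3 and 5 is not established here.

```python
import cmath
import math


def hann(k, kmin, kmax):
    """Hann-shaped window: zero outside [kmin, kmax], one at the midpoint."""
    if k < kmin or k > kmax:
        return 0.0
    return math.sin(math.pi * (k - kmin) / (kmax - kmin)) ** 2


def forward_ft_magnitude(k, chi, r_grid, kmin, kmax, kweight=0):
    """|chi(R)| from a direct sum over k; slow but dependency-free."""
    kstep = k[1] - k[0]
    result = []
    for r in r_grid:
        total = 0j
        for ki, ci in zip(k, chi):
            total += ci * ki**kweight * hann(ki, kmin, kmax) * cmath.exp(2j * ki * r)
        result.append(abs(total) * kstep / math.sqrt(math.pi))
    return result


def width_at_half_max(r_grid, mag):
    """Extent in R of the points at or above half the peak height."""
    half = max(mag) / 2
    above = [r for r, m in zip(r_grid, mag) if m >= half]
    return above[-1] - above[0]


# Synthetic single-shell signal with a hypothetical path length of 2 Angstrom.
k = [0.05 * i for i in range(301)]       # 0 .. 15 inverse Angstrom
chi = [math.sin(2 * ki * 2.0) for ki in k]
r_grid = [0.05 * i for i in range(81)]   # 0 .. 4 Angstrom

wide = forward_ft_magnitude(k, chi, r_grid, kmin=3, kmax=12)
narrow = forward_ft_magnitude(k, chi, r_grid, kmin=3, kmax=8)
print(width_at_half_max(r_grid, wide), width_at_half_max(r_grid, narrow))
```

The shorter k range gives the broader peak, so two plots of the same data can look more or less smeared purely because of the transform window.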

Thanks for the list of packages, I'll have a look at what the problem might be and how we can fix it when I get the chance.

alex-belozerov commented 7 months ago

One more question: why are the rmin and rmax parameters available in Larch Artemis but not in Larch Athena?

Performing the FT in the Athena tool calls the xftf function from Larch, which is explicitly a forward transform from k-space to r-space. In Artemis, the feffit_transform function allows you to choose whether to perform the fit in k-space, r-space or one of the other options. The extra options it has are there to allow back transforms (in the Larch notation, I believe it goes k -> r -> q, so choosing to do the fit in q would utilise both the k and r settings, but this should be described in more detail in the documentation somewhere). At least that's my understanding of how the two differ. It's also worth noting that the more advanced settings may not have been added (yet) and will use the defaults from the documentation, but this can be expanded as needed.

Patrick, thanks for the clarification. As you mentioned, the rmin and rmax parameters can only be set in the Larch Artemis tool. However, there is no option to plot several curves on the same plot in Larch Artemis. To overcome this, we could use the feffit_transform function in Larch Plot instead of xftf. What do you think?

I updated the Galaxy results for Figs. 3-5 above, using Abraham's FT parameters. As expected, this improved the agreement with Abraham's results. However, to get all curves on the same plot in Figs. 3 and 5, I used Larch Plot, where the rmin and rmax FT parameters are omitted.

patrick-austin commented 7 months ago

Just to write up notes on what I mentioned in the meeting:

The general principle I've been following until now was that the plot tool shouldn't accept the Athena/Artemis parameters, otherwise it will just duplicate those tools. Athena parameters are loaded from the .prj but the user can't input them. Practically speaking, I'm not sure if just adding the feffit_transform group would be enough without also adding inputs for the FEFF paths etc. (even if we can perform the forward and reverse transform this would just be the experimental data and not include the FEFF fit). Hopefully a wider effort to improve the plotting outputs will make this easier (possibly outputting plaintext from Artemis that can be combined by the plotting tool with other outputs for comparison)?