AlexeyPechnikov / pygmtsar

PyGMTSAR (Python InSAR): Powerful and Accessible Satellite Interferometry
http://insar.dev/
BSD 3-Clause "New" or "Revised" License

[Help]: How to compile and install the sbas_parallel script on a CentOS system? #142

Closed MinervaLee0429 closed 3 months ago

MinervaLee0429 commented 3 months ago

@AlexeyPechnikov @SteffanDavies Dear professors and experts, I have a question. During my recent SBAS-InSAR processing I found that single-threaded sbas needs more memory than I have available, so I want to use the sbas_parallel script for parallel processing on CentOS. However, after searching the relevant documents and earlier posts, I found installation instructions only for Ubuntu and macOS. I could not find instructions for CentOS, and there is no reference on how to set the relevant flags. I look forward to your replies.

MinervaLee0429 commented 3 months ago

@AlexeyPechnikov Professor, I would like to ask about the steps for compiling and installing sbas_parallel on a CentOS system. Are they the same as the steps you and Professor SteffanDavies shared at https://github.com/AlexeyPechnikov/pygmtsar/pull/23?

AlexeyPechnikov commented 3 months ago

If you are using PyGMTSAR, you do not need this binary. For GMTSAR-related inquiries, please direct your questions to the GMTSAR project.

MinervaLee0429 commented 3 months ago

@AlexeyPechnikov Professor, the GMTSAR project team did not provide instructions for this. While searching, I found the description that you and Professor SteffanDavies proposed, which is why I am asking you about the installation. Can you confirm whether that description applies to CentOS? I previously sent questions to the GMTSAR project and have not received a response, so for now I can only ask you. If the instructions apply to CentOS, I intend to follow them to compile and install the tool.

AlexeyPechnikov commented 3 months ago

For GMTSAR SBAS, you often need more than 512 GB of RAM. While the sbas_parallel tool can utilize all your CPU cores, it does not limit memory consumption. For a compiled utility that runs on CentOS, Windows, macOS, etc., consider using the GMTSAR Docker image available here: https://hub.docker.com/r/mobigroup/gmtsar. To perform SBAS analysis quickly and without excessive memory usage, refer to the PyGMTSAR online interactive examples on the main page of the repository.
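To get a feel for why memory becomes the limit, the arithmetic can be sketched as below. This is only a back-of-the-envelope estimate under a stated assumption: that the whole stack of unwrapped interferograms is held in memory as 32-bit floats. The function name `stack_gib` and the example sizes are hypothetical, and the real footprint of any tool also includes coherence grids, outputs, and working buffers.

```python
def stack_gib(n_pairs, ny, nx, bytes_per_value=4):
    """Size in GiB of a stack of n_pairs grids of ny x nx values.

    Assumption: every grid is resident in memory at once and each
    value is a 32-bit float (4 bytes) unless stated otherwise.
    """
    return n_pairs * ny * nx * bytes_per_value / 2**30


# Hypothetical example: 500 interferograms of 20000 x 5000 pixels.
print(f"{stack_gib(500, 20000, 5000):.1f} GiB for the input stack alone")
```

Because the estimate scales with the number of pairs times the number of pixels, adding CPU cores does nothing to lower it.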

MinervaLee0429 commented 3 months ago

@AlexeyPechnikov Professor, the data-processing platform I am using does not support what you described above. What I need now is a precise answer: are the instructions in https://github.com/AlexeyPechnikov/pygmtsar/pull/23 suitable for a CentOS system? If not, which part of them do I need to change to compile and install sbas_parallel?

AlexeyPechnikov commented 3 months ago

#23 relates to Ubuntu Linux. You can run the GMTSAR Docker image on your CentOS system.

MinervaLee0429 commented 3 months ago

@AlexeyPechnikov Professor, the platform I am using is a supercomputer running CentOS. I do not know whether it can run the image you mentioned, and I do not have permission to change its configuration in order to use the image. I tried to compile and install sbas_parallel before, but during processing I found that it could use the memory of only one node of the supercomputer and not the memory of other nodes, so I guessed that something went wrong during compilation and installation. Also, jobs on this supercomputer are run by submitting PBS scripts from a terminal, and I am not sure whether the image is compatible with that workflow.

AlexeyPechnikov commented 3 months ago

According to https://github.com/gmtsar/gmtsar/issues/941, your issue extends beyond just compilation problems:

this script can only call the memory of one node of the supercomputer ...

This is a longstanding issue. The GMTSAR SBAS tool was originally developed as a research tool for producing a series of papers for a dissertation and is only suitable for a limited number of small interferograms. From what I recall, GMTSAR SBAS combines spatial and temporal computations, leading to high memory consumption, and there’s little that can be done to mitigate this within the tool itself. To distribute processing across nodes effectively, you would need to separate the spatial and temporal analyses and appropriately allocate your dataset across nodes for each stage of processing, as implemented in PyGMTSAR. Furthermore, PyGMTSAR can handle large-scale processing even on a basic Apple Air laptop, eliminating the need for a supercomputer. You could write a wrapper to load your GMTSAR-preprocessed scenes (SLC, PRM, LED files) or interferograms into PyGMTSAR and conduct SBAS and PSI processing.
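The idea of separating the temporal inversion from the spatial extent can be sketched in NumPy. This is an illustrative sketch only, not PyGMTSAR's or GMTSAR's actual implementation: the function names `design_matrix` and `sbas_by_blocks`, the unweighted least-squares solution, and the block size are all assumptions made for the example. The temporal operator is built once from the pair network, then applied to one spatial block at a time, so only a single block of the stack needs to be processed at any moment.

```python
import numpy as np


def design_matrix(pairs, n_dates):
    """Rows are interferograms, columns are intervals between consecutive dates."""
    A = np.zeros((len(pairs), n_dates - 1))
    for row, (ref, rep) in enumerate(pairs):
        A[row, ref:rep] = 1.0  # pair (ref, rep) spans intervals ref .. rep-1
    return A


def sbas_by_blocks(unwrapped, pairs, n_dates, block_rows=256):
    """Least-squares time series from unwrapped phases of shape (n_pairs, ny, nx).

    The first date is the zero reference. Rows are processed in blocks so the
    temporal solve never touches the whole scene at once.
    """
    n_pairs, ny, nx = unwrapped.shape
    # Temporal operator: depends only on the pair network, so compute it once.
    A_pinv = np.linalg.pinv(design_matrix(pairs, n_dates))
    series = np.zeros((n_dates, ny, nx))
    for y0 in range(0, ny, block_rows):
        block = unwrapped[:, y0:y0 + block_rows, :]
        # Phase increments per interval for every pixel in this block.
        increments = A_pinv @ block.reshape(n_pairs, -1)
        # Accumulate increments into displacement phase relative to date 0.
        cumulative = np.cumsum(increments, axis=0)
        series[1:, y0:y0 + block_rows, :] = cumulative.reshape(n_dates - 1, -1, nx)
    return series
```

Because each block is independent, the blocks could in principle be read lazily from disk or handed to different workers, which is the property the comment above describes as missing from a combined spatial and temporal solve.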

MinervaLee0429 commented 3 months ago

@AlexeyPechnikov OK Professor, I understand what you are saying, and I understand the advantage that PyGMTSAR needs less memory. But does it take longer to run with PyGMTSAR? My original reason for using a supercomputer was to take advantage of its larger memory and higher speed. I would also like to ask: I have already processed everything up to the last step with GMTSAR, so only the sbas step remains to get the time-series result. If I use PyGMTSAR instead of sbas_parallel for that step, can the results of the previous steps still be used directly, or do I need to start over?

AlexeyPechnikov commented 3 months ago

Using supercomputers for this task is neither faster nor easier if you’re unfamiliar with the process. Typically, supercomputers do not have a large amount of RAM per node, which makes them ineffective for many common software applications not optimized for such environments. Also, have you already unwrapped and detrended your interferograms accurately, removing errors such as orbital ramps, ionospheric ramps, tidal phases, stratified and turbulent tropospheric delays, and so on? If so, you could potentially reduce the resolution of your interferograms to manage the task even with GMTSAR. If not, your SBAS results are likely to be biased and inaccurate, regardless of the software used.
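Reducing the resolution can be illustrated with simple block averaging. This is a minimal sketch, assuming the grid is already unwrapped phase (averaging wrapped phase this way would not be valid) and that missing pixels are stored as NaN. The function name `block_average` is hypothetical and is not a GMTSAR or PyGMTSAR call.

```python
import numpy as np


def block_average(grid, factor):
    """Downsample a 2-D grid by averaging factor x factor blocks, ignoring NaNs.

    Trailing rows and columns that do not fill a whole block are dropped.
    """
    ny, nx = grid.shape
    ny_trim, nx_trim = ny - ny % factor, nx - nx % factor
    trimmed = grid[:ny_trim, :nx_trim]
    # Split each axis into (number of blocks, pixels per block).
    blocks = trimmed.reshape(ny_trim // factor, factor, nx_trim // factor, factor)
    return np.nanmean(blocks, axis=(1, 3))
```

A factor of 4 in each direction cuts the number of pixels, and therefore the memory needed for the stack, by a factor of 16.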

MinervaLee0429 commented 3 months ago

@AlexeyPechnikov So, Professor, processing with PyGMTSAR is going to be slow, right? I have already done what I need with some of GMTSAR's tools, following the SBAS tutorial provided by GMTSAR, so I just need to run sbas_parallel to get the final result. But according to you, PyGMTSAR cannot be used on supercomputers, and even if it could, it would be extremely limited and slow, right?

AlexeyPechnikov commented 3 months ago

You’ve inverted my words.

MinervaLee0429 commented 3 months ago

@AlexeyPechnikov Sorry Professor, I don't quite understand what you are trying to tell me. I want to confirm three things. First, can PyGMTSAR be used on a supercomputer running CentOS? Second, if it can, will the processing speed be reduced? Third, if the first two points are resolved and I use PyGMTSAR for the final SBAS step, can the results of the previous steps that I produced with GMTSAR be used in PyGMTSAR, in place of sbas_parallel, to get the final result?

AlexeyPechnikov commented 3 months ago

PyGMTSAR is a Python library that uses the Sentinel-1 preprocessor from the GMTSAR project and some other binary tools such as the SNAPHU unwrapper. If you can compile GMTSAR and have a modern Python installed on your platform, you can use PyGMTSAR. Docker images also allow you to run PyGMTSAR easily on many platforms.

There are multiple supercomputer architectures, and you need to configure your software properly for each of them. That may not be easy. And why do you need a supercomputer at all? Do you really have dozens of terabytes of data? Processing a few terabytes of data can be done on a common laptop, as I’ve mentioned above.

And have you already prepared the unwrapped, detrended interferograms? GMTSAR does not provide the tools for this step, but it is mandatory. If you have only made the interferograms, that’s not enough for SBAS processing. If you believe you can perform SBAS analysis with GMTSAR tools alone, that’s usually impossible.

MinervaLee0429 commented 3 months ago

@AlexeyPechnikov Professor, the situation is this: I used to run the processing in a virtual machine on a laptop and a desktop computer, but the processing was forced to stop because it ran out of memory. I also wanted faster processing, so I moved to the supercomputer. As for the detrending step you mentioned, it is true that GMTSAR itself does not provide such a tool, but I saw from another expert that gmt grdtrend can be used for detrending. So Professor, I would like to ask: can the GMTSAR Docker image be used on the supercomputer? If so, can it use the memory of other nodes of the supercomputer during the run? Will speed be affected?
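For readers unfamiliar with what a planar detrend does, the operation can be sketched in NumPy as a least-squares fit of a plane that is then subtracted. This is an illustration of the general technique, not the grdtrend implementation, and the function name `remove_plane` is hypothetical. It removes only a linear ramp; it does not address the ionospheric, tidal, or tropospheric terms listed earlier in the thread.

```python
import numpy as np


def remove_plane(phase):
    """Fit a + b*x + c*y to the finite pixels of a 2-D grid and subtract it.

    NaN pixels are excluded from the fit and stay NaN in the result.
    """
    ny, nx = phase.shape
    y, x = np.mgrid[0:ny, 0:nx]
    valid = np.isfinite(phase)
    # One row per valid pixel: constant term, x coordinate, y coordinate.
    G = np.column_stack([np.ones(valid.sum()), x[valid], y[valid]])
    coeffs, *_ = np.linalg.lstsq(G, phase[valid], rcond=None)
    plane = coeffs[0] + coeffs[1] * x + coeffs[2] * y
    return phase - plane
```

Applied to each unwrapped interferogram before the time-series inversion, this leaves the residual phase around the fitted ramp.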

AlexeyPechnikov commented 3 months ago

You need to have Docker installed on the host. It runs on a single node and can use only that node's resources.

MinervaLee0429 commented 3 months ago

@AlexeyPechnikov Thank you for your response, Professor. Does this prevent processing from terminating due to insufficient RAM? Also, on this platform the supercomputer's resources can be used only by submitting a PBS script. Otherwise, processing runs on the login node, whose configuration and performance are very ordinary, so processing will be slow. So I wanted to ask: is it possible to submit PBS scripts to the supercomputer using Docker? And since the supercomputer is operated from the command line, how do I adjust Docker's RAM and CPU allocation?

AlexeyPechnikov commented 3 months ago

Note that GMTSAR can utilize only one node because it has no inter-process communication capability. That means only a single node's RAM and CPU are available to you. sbas_parallel is not related to supercomputers; it only uses the multiple CPU cores of a single node. The performance will probably be lower than on your desktop computer. Docker is a command-line tool, but again, you can run GMTSAR on a single node only.

MinervaLee0429 commented 3 months ago

@AlexeyPechnikov Thank you for your generous and patient reply, Professor. It now seems that the Docker image cannot meet my needs. If the processing speed would be reduced and the supercomputer's high-end configuration cannot be used, I have to give up on using Docker. I will look for and experiment with ways for sbas_parallel to use other nodes of the supercomputer. Thank you again for your patience.