Weiming-Hu / AnalogsEnsemble

The C++ and R packages for parallel ensemble forecasts using Analog Ensemble
https://weiming-hu.github.io/AnalogsEnsemble/
MIT License

Tips and Caveats #81

Open Weiming-Hu opened 4 years ago

Weiming-Hu commented 4 years ago

[R and C++] About times

All times, including forecast times, forecast lead times, and observation times, in the program use seconds as the unit. The origin is assumed to be 1970-01-01 and the time zone is UTC. This is important to keep in mind when you are using either the C++ or R interfaces.
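As a rough illustration of this convention, the Python sketch below converts a UTC date to seconds since 1970-01-01 and expresses a lead time in seconds. The helper name to_epoch_seconds and the example date are only illustrative and are not part of the AnEn interfaces; adding the lead time to the forecast time to get a valid time is an assumption about how the two are usually combined.

```python
from datetime import datetime, timedelta, timezone


def to_epoch_seconds(dt):
    """Return seconds since 1970-01-01 00:00:00 UTC for a timezone-aware datetime."""
    if dt.tzinfo is None:
        # A naive datetime would be interpreted in the local time zone, not UTC.
        raise ValueError("naive datetime: attach a time zone before converting")
    return int(dt.timestamp())


# Hypothetical forecast initialized at 2013-02-20 06:00 UTC.
forecast_time = datetime(2013, 2, 20, 6, 0, tzinfo=timezone.utc)
forecast_seconds = to_epoch_seconds(forecast_time)

# A 3-hour lead time, also expressed in seconds.
lead_time_seconds = int(timedelta(hours=3).total_seconds())

# Assumption: valid time = forecast time + lead time.
valid_seconds = forecast_seconds + lead_time_seconds
```

Requiring a timezone-aware datetime avoids silently shifting times by the local UTC offset.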

Weiming-Hu commented 4 years ago

[C++] Using regular expressions

If you have the following forecast file name:

/home/graduate/wuh20/storage/data/NAM/nam_218_20130220_0600_000.grb2

An example of a regular expression that matches this file name and captures the hour part (06) of the forecast cycle time 0600 would be as follows:

.*nam_218_\d{8}_(\d{2})\d{2}_\d{3}\.grb2$

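Some explanations:

.* matches the leading directory path.
nam_218_ matches the fixed prefix of the file name.
\d{8} matches the 8-digit date, 20130220.
(\d{2}) is the only capture group and captures the first two digits of the cycle time, 06.
\d{2} matches the last two digits of the cycle time, 00, without capturing them.
\d{3} matches the 3-digit field before the extension, 000.
\.grb2$ matches the literal extension at the end of the file name.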
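The pattern can be tried out quickly before putting it into a command line or config file. The sketch below uses Python's re module purely for illustration; the C++ tools may use a different regular expression engine, so small syntax differences are possible. The variable names are made up for this example.

```python
import re

# The pattern from above; the single capture group holds the cycle hour.
CYCLE_PATTERN = re.compile(r".*nam_218_\d{8}_(\d{2})\d{2}_\d{3}\.grb2$")

path = "/home/graduate/wuh20/storage/data/NAM/nam_218_20130220_0600_000.grb2"

match = CYCLE_PATTERN.match(path)

# group(1) is the text captured by (\d{2}); None signals a file name that does not fit.
cycle_hour = match.group(1) if match else None

# A file name with a different extension is rejected by the trailing \.grb2$.
rejected = CYCLE_PATTERN.match(path.replace(".grb2", ".nc"))
```

Note that the group captures only the first two digits of 0600, not all four.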

Weiming-Hu commented 4 years ago

[C++] Do I need quotes

If you are writing a config file, you don't need quotes when specifying arguments. For example:

Weiming-Hu commented 4 years ago

[R and C++] Orders matter

During the data preparation stage for forecasts and observations, users are responsible for aligning the data, specifically along the station dimension.

Recall that forecasts have 4 dimensions, namely [parameters, stations, times, lead times], and observations have 3 dimensions, namely [parameters, stations, times]. It is assumed that both have the same number of stations (otherwise an error is generated) and that the stations are in the same order in both datasets (the station order is not checked).

If you are using RAnEn, the function formatObservations has an argument sort.stations that helps you reorder the data based on a particular station order. You need to process the forecast dataset with the same ordering rule.
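Because the station order is not checked, it can help to align the two datasets explicitly before running the program. The Python sketch below shows one way to do that with plain nested lists shaped [parameters][stations][times]. It assumes each dataset comes with a list of unique station IDs; the functions station_permutation and reorder_stations and the station IDs are hypothetical and are not part of RAnEn or the C++ library.

```python
def station_permutation(reference_ids, other_ids):
    """Return indices that reorder other_ids into the order of reference_ids."""
    if len(reference_ids) != len(other_ids):
        raise ValueError("datasets have different numbers of stations")
    if len(set(reference_ids)) != len(reference_ids):
        raise ValueError("duplicate station IDs in the reference dataset")
    position = {station: index for index, station in enumerate(other_ids)}
    if len(position) != len(other_ids):
        raise ValueError("duplicate station IDs in the second dataset")
    missing = [station for station in reference_ids if station not in position]
    if missing:
        raise ValueError(f"stations missing from the second dataset: {missing}")
    return [position[station] for station in reference_ids]


def reorder_stations(observations, order):
    """Reorder the station axis of nested lists shaped [parameters][stations][times]."""
    return [[parameter[index] for index in order] for parameter in observations]


# Hypothetical station IDs; the forecast order is used as the reference.
forecast_station_ids = ["A", "B", "C"]
observation_station_ids = ["C", "A", "B"]

# One parameter, three stations (stored in the order C, A, B), two times.
observations = [[[30.0, 31.0], [10.0, 11.0], [20.0, 21.0]]]

order = station_permutation(forecast_station_ids, observation_station_ids)
aligned = reorder_stations(observations, order)
```

After this step the observations follow the forecast station order A, B, C, which is the situation the program assumes.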

Weiming-Hu commented 3 years ago

[C++ and Python] Install AnEn with PyTorch and Grid. CAREFUL

If you configure with -DENABLE_AI=ON, -DBUILD_SHARED_LIBS=ON, and -DBUILD_PYGRIB=ON, all libraries and CLI tools are built as shared objects, and presumably they are linked against the libTorch that you have downloaded and extracted.

However, your Python environment is likely to have its own libTorch version, so there might be two different versions of libTorch on your machine.

If you run make install, most likely you will get:

$ anen_grib --version
anen_grib: error while loading shared libraries: libtorch_cpu.so: cannot open shared object file: No such file or directory

Well, this can be solved easily by adding the directory that contains libtorch_cpu.so to the environment variable LD_LIBRARY_PATH. However, if the two versions of libTorch have different symbols, it might cause additional problems.
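To see whether more than one copy of the library is reachable, you can list the directories on a search path that contain it. The Python sketch below is only a rough check: the helper dirs_providing is made up for this example, and the dynamic loader also consults other locations (for example rpath entries and the system library cache) that this code does not inspect.

```python
import os


def dirs_providing(library_name, search_path):
    """Return the directories of an os.pathsep-separated path that contain library_name."""
    found = []
    for directory in search_path.split(os.pathsep):
        # Skip empty entries produced by leading, trailing, or doubled separators.
        if directory and os.path.isfile(os.path.join(directory, library_name)):
            found.append(directory)
    return found


# Directories currently listed in LD_LIBRARY_PATH that provide libtorch_cpu.so.
candidates = dirs_providing("libtorch_cpu.so", os.environ.get("LD_LIBRARY_PATH", ""))
```

If candidates has more than one entry, the first directory on the path is normally the one that wins, which may not be the version the tools were built with.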

So the safest approach is not to run make install, but to use the CLI tools under build/apps directly. Those tools should work right after make. You can also add those paths to your PATH so that the tools can be found from your terminal.

If you only have -DENABLE_AI=ON, you are still building static objects, which are fine to install. But you won't be able to use spatial metrics and Convolutional LSTM because they depend on the Grid library.