@czender : Charlie, if you could send the following information to yanliu@illinois.edu, I can then go ahead and set up your account on the ROGER cluster: your organization name and mailing address (for sending NCSA paperwork), phone number, and org email address.
Done
I have submitted the request to set up an NCSA account for you. Will let you know when it's done.
@czender NCSA account created. Paperwork will be sent to you shortly.
i received the password. how do i ssh to the roger cluster? i.e., what's the address of the login node?
the login node is roger-login.ncsa.illinois.edu
roger does not accept my password. old or new, doesn't matter. do i actually have an account?
zender@roger-login.ncsa.illinois.edu's password:
Connection closed by 141.142.169.31
my password started to work an hour later, everything's ok now
it should work now. I just added your account to ROGER.
is there project disk space I should use? pipeline needs volatile space for intermediate datasets (e.g., /tmp), and a few GB to permanently store sample data for pipeline testing. where are the realtime/latest lemnatec data (expected to be) stored?
/projects/arpae is the project space. /gpfs_scratch/arpae is the scratch space; you can use it as temp space.
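For example, a minimal setup along those lines might be (the per-user subdirectory name below is just an illustration, not an existing convention):
# sketch: a per-user temp dir on the scratch space for volatile intermediates
export TMPDIR=/gpfs_scratch/arpae/$USER/tmp
mkdir -p "$TMPDIR"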
@dlebauer @jdmaloney could you help answer the question of realtime lemnatec data location?
i found the raw lemnatec data in, e.g., /projects/arpae/terraref/raw_data/ua-mac/MovingSensor/VNIR. next question: where to put output from hyperspectral workflow? options:
I vote for 2.
2 is good.
/projects/arpae/terraref/raw_data/ is read-only by design. /projects/arpae/terraref/raw_data/ua-mac/MovingSensor/VNIR/2016-04-11 has structure $PROJECT/<data_level>/<site>/<sensor_class>/<date>, so a derived level 0 data product might become /projects/arpae/terraref/level_0/ua-mac/hyperspectral?
@robkooper Rob - what do you think would work best (from the above, or anything other than random alphanumeric strings)?
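For instance, composing an output path from that pattern would look like this (purely illustrative; the date component is copied from the raw-data example above):
# build a level_0 output path following $PROJECT/<data_level>/<site>/<sensor_class>/<date>
PROJECT=/projects/arpae/terraref
OUT_DIR=$PROJECT/level_0/ua-mac/hyperspectral/2016-04-11
mkdir -p "$OUT_DIR"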
hyperspectral workflow is installed and breaks when it can't find the python netCDF4 module (yes, i tried module load python and pythonlibs). hints on how to find it or use it would be appreciated (i'm a novice at python), or could someone who's a python guru please install it in a central location consistent with the "module load python" paths...
could you try "module load gdal-stack"? it loads netcdf4 correctly. you might need to do "module purge" first.
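A quick way to check that the module actually provides the Python netCDF4 bindings (a minimal sketch):
module purge                 # clear previously loaded modules first
module load gdal-stack       # provides the netCDF4 python bindings
python -c "from netCDF4 import Dataset; print('netCDF4 OK')"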
is the python module for netcdf separate software, not included in the netcdf source code?
yes
i will install it as soon as possible.
what else does the code depend on? i will install them too.
Just got back in town from Globus to come to a 3pm meeting (my fill in had to leave as his wife had a baby today...legit excuse). I definitely think Option 2 is the best, something like: /projects/arpae/terraref/processed_data/ua-mac/..... to keep things consistent. Keeping all derivative data separate from the raw (for all processing, not just hyper-spectral) should be a template going forward.
Thanks, @yanliu-chn. Only thing missing is Python's netCDF4 module. It would also be helpful for the system NCO to be updated. module load nco gives version 4.5.1, which is pretty old; 4.5.9 is the latest stable version.
I don't have write permission in /projects/arpae/terraref. Ping me when there's a writable directory for the hyperspectral workflow output and then I'll make that the default...
@dlebauer what is your policy for write permission to /projects/arpae/terraref? can we make it writeable by group, but make the read-only subdirs in it writeable only by their owner?
@yanliu-chn I like your suggestion and also defer to others (like you and JD) with more knowledge and experience
So we can go with:
"writeable by group but make subdir in it writeable by only owner for readonly dirs?"
But regarding the subdirectories: I think very restricted access to the raw_data and processed_data folders would be ideal. By very restricted, perhaps raw_data only writable by root and globus, and processed_data only by root and clowder. Specifically:
- /projects/arpae/terraref/raw_data/ : write access only for root and transfer agents like globus
- terraref/ : ... (... and discuss new use cases)
- processed_data : ... (as JD suggested).
Does this seem like a sufficiently conservative approach? Does it put any undue restrictions on users (e.g. would this work for you, Charlie)?
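A sketch of what that permission scheme might look like on disk (illustrative only: the group name arpae and the service accounts globus and clowder are assumptions, and actual enforcement would be up to the admins):
chgrp -R arpae /projects/arpae/terraref
chmod 2775 /projects/arpae/terraref                        # group-writable top level, setgid
chown -R globus /projects/arpae/terraref/raw_data          # raw_data owned by the transfer agent
chmod -R 755 /projects/arpae/terraref/raw_data             # everyone else read-only
chown -R clowder /projects/arpae/terraref/processed_data   # processed_data owned by clowder
chmod -R 755 /projects/arpae/terraref/processed_data       # everyone else read-only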
Yes, that works for me. Default will be /gpfs_scratch/arpae/imaging_spectrometer. Clowder would just need to use the option: -O /projects/arpae/terraref/processed_data/subdir/subdir...
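For example (assuming -O overrides the default output directory as described; the processed_data subdirectories are illustrative):
# default invocation writes to /gpfs_scratch/arpae/imaging_spectrometer
terraref.sh -d 1 -i ~zender/data/terraref/whiteReference_raw -o whiteReference.nc
# Clowder invocation redirecting output under processed_data
terraref.sh -d 1 -i ~zender/data/terraref/whiteReference_raw -o whiteReference.nc -O /projects/arpae/terraref/processed_data/ua-mac/hyperspectral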
@yanliu-chn try this, if you dare:
module load anaconda
sudo conda install netcdf4
i didn't dare to do that. :) on the cluster, all the software packages are installed as non-root to support modules system.
I have installed python support for netCDF4 and upgraded nco to 4.5.5. I didn't find a 4.5.9 src tarball.
if you do module load gdal-stack nco, you will load pythonlibs/0.2, which includes netCDF4 python. let me know if it works.
Thanks, @yanliu-chn. However, the Terraref workflow requires Python 2.7+. /usr/bin/python is 2.6.6 (too old). module load python gives 3.4.4 (fine). The netCDF4 that comes with gdal-stack seems to be installed against 2.6.6, so it is unusable by the Terraref workflow. Unless I am doing something wrong? In any case, I need help getting Python 2.7+ at the same time as the netCDF4 python module.
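A quick way to see which interpreter and which netCDF4 build you end up with (a sketch, using the module names above):
module purge
module load gdal-stack python                         # python module gives 3.4.4; gdal-stack's netCDF4 targets 2.6.6
python --version                                      # confirm which interpreter is first on PATH
python -c "import netCDF4; print(netCDF4.__file__)"   # fails or points to a 2.6 build if there is a mismatch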
let me build a version using 2.7+
done. could you do "module load gdal-stack-2.7.10 nco" and test if it works? or you can give me a test command and I can try it first.
Progress: it fails, but it breaks somewhere new (that you might be able to fix). First, the test command is...
terraref.sh -d 1 -i ~zender/data/terraref/whiteReference_raw -o whiteReference.nc
where terraref.sh = https://github.com/terraref/computing-pipeline/tree/master/scripts/hyperspectral/terraref.sh
Second, here is the failure output I get:
zender@cg-gpu01:~$ !terraref.sh
terraref.sh -d 1 -i ${DATA}/terraref/whiteReference_raw -o whiteReference.nc
Terraref data pipeline invoked with command:
terraref.sh -d 1 -i /home/zender/data/terraref/whiteReference_raw -o whiteReference.nc
Input #00: /home/zender/data/terraref/whiteReference_raw
trn(in) : /home/zender/data/terraref/whiteReference_raw
trn(out) : /gpfs_scratch/arpae/hyperspectral_imager/terraref_tmp_trn.nc.pid24628.fl00.tmp
ncks -O --trr_wxy=926,1600,1 --trr typ_in=NC_USHORT --trr typ_out=NC_USHORT --trr ntl_in=bil --trr ntl_out=bsq --trr_in=/home/zender/data/terraref/whiteReference_raw ~/nco/data/in.nc /gpfs_scratch/arpae/hyperspectral_imager/terraref_tmp_trn.nc.pid24628.fl00.tmp
att(in) : /gpfs_scratch/arpae/hyperspectral_imager/terraref_tmp_trn.nc.pid24628.fl00.tmp
att(out) : /gpfs_scratch/arpae/hyperspectral_imager/terraref_tmp_att.nc.pid24628.fl00.tmp
ncatted -O --gaa terraref_script=terraref.sh --gaa terraref_hostname=cg-gpu01 --gaa terraref_version="4.6.0-beta03" -a "Conventions,global,o,c,CF-1.5" -a "Project,global,o,c,TERRAREF" --gaa history='Mon Apr 25 11:33:08 PDT 2016: terraref.sh -d 1 -i /home/zender/data/terraref/whiteReference_raw -o whiteReference.nc' /gpfs_scratch/arpae/hyperspectral_imager/terraref_tmp_trn.nc.pid24628.fl00.tmp /gpfs_scratch/arpae/hyperspectral_imager/terraref_tmp_att.nc.pid24628.fl00.tmp
jsn(in) : /home/zender/data/terraref/whiteReference_raw
jsn(out) : /gpfs_scratch/arpae/hyperspectral_imager/terraref_tmp_jsn.nc.pid24628
python /home/zender/terraref/computing-pipeline/scripts/hyperspectral/JsonDealer.py /home/zender/data/terraref/whiteReference_raw /gpfs_scratch/arpae/hyperspectral_imager/terraref_tmp_jsn.nc.pid24628.fl00.tmp
Traceback (most recent call last):
File "/home/zender/terraref/computing-pipeline/scripts/hyperspectral/JsonDealer.py", line 50, in <module>
from netCDF4 import Dataset
File "/sw/pylibs-2.7.10/lib/python2.7/site-packages/netCDF4/__init__.py", line 3, in <module>
from ._netCDF4 import *
ImportError: /sw/pylibs-2.7.10/lib/python2.7/site-packages/netCDF4/_netCDF4.so: undefined symbol: _Py_ZeroStruct
terraref.sh: ERROR Failed to parse JSON metadata. Debug this:
python /home/zender/terraref/computing-pipeline/scripts/hyperspectral/JsonDealer.py /home/zender/data/terraref/whiteReference_raw /gpfs_scratch/arpae/hyperspectral_imager/terraref_tmp_jsn.nc.pid24628.fl00.tmp
could you do "module purge" first before module load? I ran the last debug line
python /home/zender/terraref/computing-pipeline/scripts/hyperspectral/JsonDealer.py /home/zender/data/terraref/whiteReference_raw /gpfs_scratch/arpae/hyperspectral_imager/terraref_tmp_jsn.nc.pid24628.fl00.tmp
and it worked.
i guess the module load command needs to be included in ~/.bashrc .
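For example (a sketch; alternatively the load can go at the top of terraref.sh itself, as ended up happening below):
# make the environment available in every login shell
echo 'module load gdal-stack-2.7.10' >> ~/.bashrc
# ...or load it inside the workflow script so users need no manual setup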
yes, it works now that i put "module load gdal-stack-2.7.10" directly in the script. in general, do not load the nco module yourself, because the script points to the latest NCO in my home directory and may rely on some newer features. users of terraref.sh need not load any modules by hand. running some benchmarks now. thanks for your speedy help, yanliu.
glad that it works now.
@yanliu-chn please set-up an account on Roger for @FlyingWithJerome Jerome Mao, my student who is developing the environmental logger workflow. He will send you his 411.
Jerome @FlyingWithJerome, could you send the following information to yanliu@illinois.edu, I can then go ahead and set up your account on the ROGER cluster: your organization name and mailing address (for sending NCSA paperwork), phone number, and org email address.
I have created the request for your NCSA and ROGER account. It will take a few business days for the paperwork to arrive. Please check your mailbox.
@yanliu-chn thanks!
@yanliu-chn in the last few weeks we released NCO 4.6.0, which contains all the features necessary for Terraref. If the system NCO is updated to 4.6.0, then the hyperspectral workflow can be run with stable system executables rather than bleeding-edge snapshots from my personal directory :) https://github.com/nco/nco/releases
I will update the NCO version on ROGER asap and let you know when it is done.
Done. I have built 4.6.0 and made it the default to load when doing "module load nco".
@czender could you test if the build works as expected?
It fails because the system NCO package was not installed with ncap2. Building ncap2 requires first building or installing ANTLR, as described on the NCO homepage nco.sf.net.
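Roughly, the from-source build order looks like this (a sketch; install prefixes are illustrative, and the NCO homepage has the authoritative steps):
cd antlr-2.7.7 && ./configure --prefix=$HOME/sw/antlr && make && make install
export PATH=$HOME/sw/antlr/bin:$PATH                  # so NCO's configure can find antlr
cd ../nco-4.6.0 && ./configure --prefix=$HOME/sw/nco && make && make install   # ncap2 is built once antlr is found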
I see. I rebuilt nco 4.6.0 with ncap2 and antlr-2.7.7. could you load it with module load gdal-stack-2.7.10 nco and test it again?
@czender - were you able to test this?
I think Charlie mentioned that this has been tested.
yes i tested this (two weeks ago) and it works. sorry for not updating this earlier.
@czender can this be closed then?
i don't think so. there is no clowder extractor that i know of for terraref.sh.
right. my team will develop the extractor next.
@yanliu-chn and @czender