Closed nathanlcarlson closed 5 years ago
PR #616 has been created to begin addressing these issues.
The "9 DB properties are missing" message is actually accurate. After looking through the source, I found the /esg/config/esgf.properties parser starting on line 915 (the link should bring you right there). After reading through it, I concluded that it does not find the properties because of the whitespace around the equals sign, like so:
sample.property = value
It will find the property and value if it is instead
sample.property=value
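To make the failure mode concrete, here is a minimal Python sketch of the difference between a parser that splits on the equals sign without trimming and one that strips whitespace. This is not the actual parser from the source (that one is not reproduced here); parse_strict and parse_tolerant are hypothetical names used only for illustration.

```python
def parse_strict(line):
    """Split on the first '=' without trimming anything.

    With 'key = value' this yields 'key ' and ' value', so a lookup
    for the exact key name fails.
    """
    key, _, value = line.partition("=")
    return key, value


def parse_tolerant(line):
    """Split on the first '=' and strip whitespace from both sides."""
    key, _, value = line.partition("=")
    return key.strip(), value.strip()


# Both spellings end up under the same key with the tolerant version.
props = {}
for raw in ["sample.property = value", "other.property=value"]:
    key, value = parse_tolerant(raw)
    props[key] = value
```

With the strict version the key comes back with a trailing space, which would explain why a lookup by exact name reports the property as missing.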
A quick solution to this, as we are using Python's ConfigParser, would be to set the space_around_delimiters flag to False when writing the file. Unfortunately, this flag was only introduced in Python 3. I am going to look at the ESGF homegrown config parser, which is based on Python's ConfigParser.
Fortunately, the new features of configparser in Python 3.6+ have been backported. See https://pypi.org/project/configparser/. Also, it is already being installed, likely as a dependency of configobj.
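As a rough sketch of what the space_around_delimiters approach could look like, the snippet below writes a property with no whitespace around the equals sign. It uses the Python 3 standard library configparser; the section name "installer.properties" and the in-memory buffer are assumptions for illustration only, and the real esgf.properties layout may differ.

```python
import configparser
import io

config = configparser.ConfigParser()
# Hypothetical section and key, used only to demonstrate the output format.
config["installer.properties"] = {"sample.property": "value"}

# Write to an in-memory buffer instead of /esg/config/esgf.properties.
buffer = io.StringIO()
config.write(buffer, space_around_delimiters=False)
output = buffer.getvalue()
```

The resulting text contains sample.property=value rather than sample.property = value, which is the form the dashboard parser was observed to accept.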
esgf-dashboard-ip has a number of issues after successfully running /usr/local/esgf-dashboard-ip/bin/ip.service start with httpd started.
[WARNING][esgf-dashboard-ip.c][589] Please note that 11 non-mandatory properties are missing in the esgf.properties file. Default values have been loaded
./.work/shards.xml:1: parser error : Space required after the Public Identifier
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
^
./.work/shards.xml:1: parser error : SystemLiteral " or ' expected
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
^
./.work/shards.xml:1: parser error : SYSTEM or PUBLIC, the URI is missing
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
^
[xmlreader.c:795] Error: could not parse file ./.work/shards.xml
START PLANA
[ERROR][dbAccess.c][1252] select url_path,id,user_id from esgf_dashboard.dashboard_queue where processed=0 and not url_path like 'http%' order by timestamp ASC limit 1000;
[ERROR][dbAccess.c][1252] select url_path,id,user_id from esgf_dashboard.dashboard_queue where processed=0 and not url_path like 'http%' order by timestamp ASC limit 1000;
[ERROR][dbAccess.c][1252] select url_path,id,user_id from esgf_dashboard.dashboard_queue where processed=0 and not url_path like 'http%' order by timestamp ASC limit 1000;
[ERROR][dbAccess.c][1252] select url_path,id,user_id from esgf_dashboard.dashboard_queue where processed=0 and not url_path like 'http%' order by timestamp ASC limit 1000;
NOTICE: table "cmip5_data_usage_continent_tmp" does not exist, skipping
[WARNING][dbAccess.c][291] Query submission failed [drop table if exists esgf_dashboard.cmip5_data_usage_continent_tmp; create table esgf_dashboard.cmip5_data_usage_continent_tmp as (SELECT EXTRACT (YEAR FROM (TIMESTAMP WITH TIME ZONE 'epoch' + fixed_log.date_fetched * INTERVAL '1 second')) AS year, EXTRACT (MONTH FROM (TIMESTAMP WITH TIME ZONE 'epoch' + fixed_log.date_fetched * INTERVAL '1 second')) AS month, COUNT(*) AS downloads, COUNT(distinct url) AS files, COUNT(distinct user_id_hash) AS users, SUM(fixed_log.size)/1024/1024/1024 AS gb, fixed_log.continent FROM (SELECT cl.continent, file.url, log.user_id_hash, max(log.timestamp) AS date_fetched, max(file.size) AS size FROM esgf_dashboard.dashboard_queue AS log JOIN esgf_dashboard.client_stats_dm AS cl ON (log.remote_addr=cl.ip) JOIN public.file_version AS file ON (UPPER(log.url_path) LIKE '%CMIP5%.NC' AND log.url_path=file.url) WHERE log.success AND log.duration > 1000 GROUP BY file.url, log.user_id_hash, cl.continent) AS fixed_log GROUP BY year, month, continent ORDER BY year, month);]
The error messages go on indefinitely and cannot be stopped with ctrl+c.
This error is the result of there being no user_id column in the specified table.
[ERROR][dbAccess.c][1252] select url_path,id,user_id from esgf_dashboard.dashboard_queue where processed=0 and not url_path like 'http%' order by timestamp ASC limit 1000;
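One way to confirm a missing column like this is to ask PostgreSQL's information_schema directly. The helper below is a hypothetical sketch, not part of esgf-dashboard-ip: it assumes a DB-API cursor that uses the %s parameter style (as psycopg2 does), and column_exists is a name made up for this example.

```python
def column_exists(cursor, schema, table, column):
    """Return True if schema.table has a column with the given name.

    `cursor` is assumed to be a DB-API cursor using '%s' placeholders;
    the query targets the standard information_schema.columns view.
    """
    cursor.execute(
        "SELECT 1 FROM information_schema.columns "
        "WHERE table_schema = %s AND table_name = %s AND column_name = %s",
        (schema, table, column),
    )
    return cursor.fetchone() is not None
```

Calling it with "esgf_dashboard", "dashboard_queue" and "user_id" would be expected to return False on a database showing the error above.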
This is a bit overwhelming, and it may be a good idea to take a step back and re-evaluate this component. It has already consumed a large amount of time to get it to this point, and based on these new error messages it will likely require a significant amount more.
It also does not fit in well: it is written in C even though it seemingly performs very common tasks that existing Python components already perform (db interactions, http requests, parsing xml, parsing config files, reporting information).
It seems there was a version mismatch between the schema defined in esgf-dashboard-ip and that created by the esgf_dashboard_initialize script. This was resolved by pulling the correct version from the mirror. It is important to note that the schema migration .egg files had the same name and were labeled the same version, but had different content.
https://aims1.llnl.gov/esgf/dist/2.6/8/esgf-dashboard/esgf_dashboard-0.0.2-py2.7.egg
!=
https://aims1.llnl.gov/esgf/dist/esgf-dashboard/esgf_dashboard-0.0.2-py2.7.egg
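Since the file name and version label were identical, the only reliable way to tell the two .egg files apart is by content. A minimal sketch of such a comparison, assuming both files have already been downloaded locally (sha256_of and same_content are hypothetical helper names):

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """Return True if both files have identical content."""
    return sha256_of(path_a) == sha256_of(path_b)
```

Running same_content on the two downloaded copies would be expected to return False in the situation described above.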
sleep(24 hours)
). A test was performed over the weekend to see if the data would update, and it did. This evidence is enough to say, unless something comes up, that the esgf-dashboard-ip service is in working order.
LD_LIBRARY_PATH
Using
Reports the following error
liblzma.so.5 is intended to be found in /usr/local/conda/envs/esgf-pub/lib, but it is not included in the LD_LIBRARY_PATH in /etc/esg.env.
ip.service has the following line.
This overwrites any manual attempt to specify the LD_LIBRARY_PATH with the specification in /etc/esg.env. This environment variable should include the conda env lib path above.
Missing Properties
Additionally, after manually resolving the above error, the following error is reported by ip.service start.
My initial thoughts are that the "9 DB properties are missing" message is misleading. Looking in /esg/config/esgf.properties, it looks like only the following are missing
I do not know what these values should be.
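For the LD_LIBRARY_PATH problem described in this report, a quick way to check whether the conda lib directory is actually on the path is sketched below. The helper name path_in_library_path is hypothetical, and the check only looks at the current process environment, not at what /etc/esg.env would set.

```python
import os


def path_in_library_path(directory, library_path):
    """Return True if `directory` is an entry in a colon-separated path list."""
    wanted = os.path.normpath(directory)
    entries = [entry for entry in library_path.split(":") if entry]
    return any(os.path.normpath(entry) == wanted for entry in entries)


# Directory taken from the report; the environment value is whatever
# the current process happens to have.
conda_lib = "/usr/local/conda/envs/esgf-pub/lib"
current = os.environ.get("LD_LIBRARY_PATH", "")
missing = not path_in_library_path(conda_lib, current)
```

If missing is True after sourcing /etc/esg.env, that would match the liblzma.so.5 lookup failure described above.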