radical-cybertools / radical.analytics

Analytics for RADICAL-Cybertools

issues with pulling sessions #56

Closed jdakka closed 6 years ago

jdakka commented 6 years ago

This is not an analytics issue but an RP issue; however, I discovered it when using ra.Session:

session = ra.Session(stype='radical.pilot', src='{0}_replicas/'.format(pipelines))

Issue:

---> 12     session = ra.Session(stype='radical.pilot', src='{0}_replicas/'.format(pipelines))
     13     units = session.filter(etype='unit', inplace=False)
     14 

/Users/JumanaDakka/environments/venv/lib/python2.7/site-packages/radical/analytics/session.pyc in __init__(self, src, stype, sid, _entities, _init)
     78             self._profile, accuracy, hostmap \
     79                               = rp.utils.get_session_profile(sid=sid, src=self._src)
---> 80             self._description = rp.utils.get_session_description(sid=sid, src=self._src)
     81 
     82             self._description['accuracy'] = accuracy

/Users/JumanaDakka/environments/venv/lib/python2.7/site-packages/radical/pilot/utils/prof_utils.pyc in get_session_description(sid, src, dburl)
    127         json = ru.read_json('%s/%s.json' % (src, sid))
    128     else:
--> 129         ftmp = fetch_json(sid=sid, dburl=dburl, tgt=src, skip_existing=True)
    130         json = ru.read_json(ftmp)
    131 

/Users/JumanaDakka/environments/venv/lib/python2.7/site-packages/radical/pilot/utils/session.pyc in fetch_json(sid, dburl, tgt, skip_existing, session, log)
    397         mongo, db, _, _, _ = ru.mongodb_connect(dburl)
    398 
--> 399         json_docs = get_session_docs(db, sid)
    400         ru.write_json(json_docs, dst)
    401 

/Users/JumanaDakka/environments/venv/lib/python2.7/site-packages/radical/pilot/utils/db_utils.pyc in get_session_docs(db, sid, cache, cachedir)
     79 
     80     # convert bson to json, i.e. serialize the ObjectIDs into strings.
---> 81     json_data['session'] = bson2json(list(db[sid].find({'type' : 'session'})))
     82     json_data['pmgr'   ] = bson2json(list(db[sid].find({'type' : 'pmgr'   })))
     83     json_data['pilot'  ] = bson2json(list(db[sid].find({'type' : 'pilot'  })))

/Users/JumanaDakka/environments/venv/lib/python2.7/site-packages/pymongo/database.pyc in __getitem__(self, name)
    199           - `name`: the name of the collection to get
    200         """
--> 201         return self.__getattr__(name)
    202 
    203     def create_collection(self, name, **kwargs):

/Users/JumanaDakka/environments/venv/lib/python2.7/site-packages/pymongo/database.pyc in __getattr__(self, name)
    189           - `name`: the name of the collection to get
    190         """
--> 191         return Collection(self, name)
    192 
    193     def __getitem__(self, name):

/Users/JumanaDakka/environments/venv/lib/python2.7/site-packages/pymongo/collection.pyc in __init__(self, database, name, create, **kwargs)
    102 
    103         if not name or ".." in name:
--> 104             raise InvalidName("collection names cannot be empty")
    105         if "$" in name and not (name.startswith("oplog.$main") or
    106                                 name.startswith("$cmd")):

InvalidName: collection names cannot be empty

Stack:

  python               : 2.7.12
  pythonpath           : 
  virtualenv           : /Users/JumanaDakka/environments/venv

  radical.analytics    : v0.45.2-86-g99480a1@rc-v0.46.3
  radical.pilot        : 0.47-v0.46.2-183-g2c92e51@rc-v0.46.3
  radical.utils        : 0.47-v0.46-73-gd580ab1@rc-v0.46.3
  saga                 : 0.47-v0.46-32-ga2f9ded@HEAD-detached-at-origin-rc-v0.46.3

export RADICAL_PILOT_DBURL="mongodb://jdakka:mongo@ds231715.mlab.com:31715/new_htbac"

andre-merzky commented 6 years ago

Can you please check what '{0}_replicas/'.format(pipelines) expands to? Also, would you mind checking what sid is set to in the constructor in radical/analytics/session.py, before rp.utils.get_session_profile is called? Thanks!

jdakka commented 6 years ago

'{0}_replicas/'.format(pipelines) expands to:

['./16_replicas/rp.session.two.jdakka.017476.0003/rp.session.two.jdakka.017476.0003.json']

Here's the constructor:


    def __init__(self, sid, stype, src=None, _entities=None, _init=True):
        """
        Create a radical.analytics session for analysis.

        The session is created from a set of profiles, which usually have been
        produced from some other session object in the RCT stack, such as
        radical.pilot.  The `ra.Session` constructor expects the respective
        session ID and session type.  It optionally accepts a `src` parameter
        which can point to a location where the profiles are expected to be
        found, or where they will be stored after fetching them.  The default
        value for `src` is `$PWD/sid`.
        """

        self._sid   = sid
        self._stype = stype

        if not src:
            src = "%s/%s" % (os.getcwd(), sid)

        self._src = src

        if stype == 'radical.pilot':
            import radical.pilot as rp
            self._profile, accuracy, hostmap \
                              = rp.utils.get_session_profile    (sid=sid, src=self._src)
            self._description = rp.utils.get_session_description(sid=sid, src=self._src)

            self._description['accuracy'] = accuracy
            self._description['hostmap']  = hostmap

        else:
            raise ValueError('unsupported session type [%s]' % stype)

        self._t_start     = None
        self._t_stop      = None
        self._ttc         = None

        self._log         = None

        # internal state is represented by a dict of entities:
        # dict keys are entity uids (which are assumed to be unique per
        # session), dict values are ra.Entity instances.
        self._entities = dict()
        if _init:
            self._initialize_entities()

        # we do some bookkeeping in self._properties where we keep a list of
        # property values around which we encountered in self._entities.
        self._properties = dict()
        if _init:
            self._initialize_properties()

        # FIXME: we should do a sanity check that all encountered states and
        #        events are part of the respective state and event models

jdakka commented 6 years ago

Also here is the path to try this out:

https://github.com/radical-experiments/htbac-experiments/tree/master/esmacs_entk06/namd-entk-6-barrier_correct_timing

andre-merzky commented 6 years ago

Ah, ok - the 'src' attribute should point to a directory which contains the profiles and the json, so I guess to ./16_replicas/rp.session.two.jdakka.017476.0003/. Can you please give that a try?
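
A small sketch of how that directory (and the session ID) could be derived from the path of the session JSON file. The helper name `session_dir_and_sid` is hypothetical, and it assumes the JSON file is named after the session ID and sits directly inside the session directory. The commented-out `ra.Session` call only mirrors the keyword arguments that appear in this thread; whether they apply unchanged to other versions is an assumption.

```python
import os


def session_dir_and_sid(json_path):
    """Split a session JSON path into (session directory, session ID).

    Hypothetical helper: assumes the JSON file is named '<sid>.json' and
    lives directly inside the session directory.
    """
    json_path = os.path.normpath(json_path)
    session_dir = os.path.dirname(json_path)
    # splitext only strips the final '.json'; dots inside the sid are kept.
    sid = os.path.splitext(os.path.basename(json_path))[0]
    return session_dir, sid


src, sid = session_dir_and_sid(
    './16_replicas/rp.session.two.jdakka.017476.0003/'
    'rp.session.two.jdakka.017476.0003.json')

# Assumed usage, following the call shown earlier in this thread:
# import radical.analytics as ra
# session = ra.Session(stype='radical.pilot', src=src, sid=sid)
```

Passing the directory rather than the JSON file itself is the point here; the JSON file is then looked up inside `src`.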

jdakka commented 6 years ago

That fixed the issue. However, the profiles now show the session duration, while the RP durations are all 0.0:

['./16_replicas/rp.session.two.jdakka.017476.0003.json'] 3059.78859997 0.0 0.0 0.0

I uploaded the working version in the path https://github.com/radical-experiments/htbac-experiments/tree/master/esmacs_entk06/namd-entk-6-barrier_correct_timing

andre-merzky commented 6 years ago

Thanks - let me try to reproduce this. The profiles look ok actually.

andre-merzky commented 6 years ago

Hey Jumana,

this seems to be caused by the directory structure. An RP profile tree would look somewhat like this:

rp.session.two.jdakka.017476.0003/
rp.session.two.jdakka.017476.0003/update.0.prof
rp.session.two.jdakka.017476.0003/...
rp.session.two.jdakka.017476.0003/rp.session.two.jdakka.017476.0003.json
rp.session.two.jdakka.017476.0003/pilot.0000/
rp.session.two.jdakka.017476.0003/pilot.0000/agent_0.prof
rp.session.two.jdakka.017476.0003/pilot.0000/...

and that session dir is what you pass as src to the ra.Session constructor. If I rearrange your session that way, all profiles are found, and unit durations appear.
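
A rough way to check a directory against that layout before handing it to ra.Session. This is only a sketch based on the tree above: the function name `check_session_layout` is hypothetical, and it assumes the directory is named after the session ID, holds `<sid>.json` and `*.prof` files at the top level, and has `pilot.*` subdirectories with their own `*.prof` files.

```python
import glob
import os


def check_session_layout(session_dir):
    """Return a list of problems found in an RP-style session directory.

    Hypothetical sanity check; the expected layout is an assumption taken
    from the example tree in this thread, not from the RP documentation.
    """
    problems = []
    # The directory name is assumed to be the session ID.
    sid = os.path.basename(os.path.normpath(session_dir))

    if not os.path.isfile(os.path.join(session_dir, sid + '.json')):
        problems.append('missing %s.json' % sid)

    if not glob.glob(os.path.join(session_dir, '*.prof')):
        problems.append('no *.prof files at the top level')

    pilot_dirs = sorted(d for d in glob.glob(os.path.join(session_dir, 'pilot.*'))
                        if os.path.isdir(d))
    if not pilot_dirs:
        problems.append('no pilot.* subdirectories')
    for pdir in pilot_dirs:
        if not glob.glob(os.path.join(pdir, '*.prof')):
            problems.append('no *.prof files in %s' % os.path.basename(pdir))

    return problems
```

An empty list means the directory matches the layout above; it does not say anything about whether the profiles themselves are complete.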

jdakka commented 6 years ago

Solved by using the following stack:

radical.analytics    : v0.45.2-86-g99480a1@rc-v0.46.3
radical.pilot        : 0.47-v0.46.2-183-g2c92e51@rc-v0.46.3
radical.utils        : 0.47-v0.46-73-gd580ab1@rc-v0.46.3
saga                 : 0.47-v0.46-32-ga2f9ded@HEAD-detached-at-origin-rc-v0.46.3