HEP-KBFI / tth-htt

code and python config files for ttH, H -> tautau analysis with matrix element techniques

Migrate from Hadoop to ceph #183

Closed ktht closed 1 year ago

ktht commented 1 year ago

At the very least, this implies updating the sample dictionaries and removing the hdfs modules and commands from the code.
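A rough sketch of what the dictionary update could look like. The prefixes `/hdfs` and `/ceph`, the helper names, and the nested dict/list layout are all assumptions for illustration and not the repository's actual values or code; Python 3 is assumed.

```python
# Hypothetical prefixes: the real mount points must be confirmed with the sysadmin.
OLD_PREFIX = "/hdfs"
NEW_PREFIX = "/ceph"


def migrate_path(path, old=OLD_PREFIX, new=NEW_PREFIX):
    """Swap the leading storage prefix; leave unrelated paths untouched."""
    # Match only a whole leading path component, so "/hdfsx/..." is not rewritten.
    if path == old or path.startswith(old + "/"):
        return new + path[len(old):]
    return path


def migrate_samples(samples):
    """Return a copy of a (possibly nested) sample dictionary with string values migrated."""
    if isinstance(samples, dict):
        return {key: migrate_samples(value) for key, value in samples.items()}
    if isinstance(samples, list):
        return [migrate_samples(value) for value in samples]
    if isinstance(samples, str):
        return migrate_path(samples)
    # Numbers, booleans, None, etc. are passed through unchanged.
    return samples
```

Only values are rewritten, not dictionary keys, so sample names stay stable while the stored locations change.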

saswatinandan commented 1 year ago

Are you creating new dictionaries? I need to submit new jobs asap

ktht commented 1 year ago

I'll look into this. ETA by tomorrow. We need to double-check with our sysadmin that we can proceed with the job submission, though.

ktht commented 1 year ago

AFAICT, everything should now be using the Ceph paths instead of the old Hadoop paths and modules. Just to reiterate what I wrote in the commit messages:

I kept the HDFS ROOT plugin, the hdfs Python module and TFileOpenWrapper for historical reasons, since this repository will be deprecated anyway once the bbWW analysis has concluded.

Reading from and writing to /local works as intended, and the framework is able to figure out the correct input paths, so at least that much is known to work at this point.

Also note that the same best practices apply to Ceph as they applied to Hadoop:

There are no special commands (other than those listed here) that bypass FUSE (since there is no FUSE to begin with) or otherwise give more "direct" access to the underlying file system. Use the standard POSIX commands instead.
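For example, from Python the usual file operations can be done with the standard library alone, with no hdfs-specific module involved. This is only a minimal sketch: the helper names `stage_file` and `list_root_files` are made up for illustration and are not part of this repository, and Python 3 is assumed.

```python
import os
import shutil


def stage_file(src, dst_dir):
    """Copy src into dst_dir using ordinary file-system calls; return the new path."""
    os.makedirs(dst_dir, exist_ok=True)  # equivalent of `mkdir -p`
    dst = os.path.join(dst_dir, os.path.basename(src))
    shutil.copy2(src, dst)  # equivalent of `cp -p` (keeps timestamps where possible)
    return dst


def list_root_files(directory):
    """Return the *.root files directly inside directory, sorted by path."""
    return sorted(
        entry.path
        for entry in os.scandir(directory)
        if entry.is_file() and entry.name.endswith(".root")
    )
```

The same calls work unchanged on any mounted POSIX file system, so nothing in them depends on which storage backend sits behind the path.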

Closing this issue with the disclaimer that not everything has been fully tested, and that we need to coordinate with our sysadmin before submitting anything substantial to the cluster.