Closed alexander-held closed 1 year ago
From the CMS ttbar notebook, slightly adapted:
```python
if PIPELINE == "servicex_databinder":
    from servicex_databinder import DataBinder
    t0 = time.time()

    # query for events with at least 4 jets with 25 GeV, at least one b-tag, and exactly one electron or muon with pT > 25 GeV
    # returning columns required for subsequent processing
    query_string = """Where(
        lambda event: event.electron_pt.Where(lambda pT: pT > 25).Count()
        + event.muon_pt.Where(lambda pT: pT > 25).Count()
        == 1
        ).Where(lambda event: event.jet_pt.Where(lambda pT: pT > 25).Count() >= 4
        ).Where(lambda event: event.jet_btag.Where(lambda btag: btag > 0.5).Count() >= 1
        ).Select(
            lambda e: {"electron_pt": e.electron_pt, "muon_pt": e.muon_pt,
                       "jet_pt": e.jet_pt, "jet_eta": e.jet_eta, "jet_phi": e.jet_phi,
                       "jet_mass": e.jet_mass, "jet_btag": e.jet_btag}
        )"""

    query_string = """Where(
        lambda event: event.electron_pt.Where(lambda pT: pT > 25).Count()
        + event.muon_pt.Where(lambda pT: pT > 25).Count()
        == 1
        ).Select(
            lambda e: {"electron_pt": e.electron_pt, "muon_pt": e.muon_pt,
                       "jet_pt": e.jet_pt, "jet_eta": e.jet_eta, "jet_phi": e.jet_phi,
                       "jet_mass": e.jet_mass, "jet_btag": e.jet_btag}
        )"""

    sample_names = ["ttbar__nominal"]  # 1.5 TB, 7066 files
    sample_list = []
    for sample_name in sample_names:
        sample_list.append({"Name": sample_name,
                            "RucioDID": f"user.ivukotic:user.ivukotic.{sample_name}",
                            "Tree": "events",
                            "FuncADL": query_string})

    databinder_config = {
        "General": {
            "ServiceXBackendName": "uproot",
            "OutputDirectory": "outputs_databinder",
            "OutputFormat": "root",
            "IgnoreServiceXCache": SERVICEX_IGNORE_CACHE
        },
        "Sample": sample_list
    }

    sx_db = DataBinder(databinder_config)
    out = sx_db.deliver()
    print(f"execution took {time.time() - t0:.2f} seconds")
```
This fails for a single file. The failure seems to be query-related.
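Since the failure appears tied to the query, it may help to check the selection logic independently of ServiceX. The following is only a rough sketch that emulates the two query variants in plain Python on hand-written toy events; it does not use ServiceX or func_adl. The names `count_above`, `passes_selection`, and `toy_events` are hypothetical and not part of the notebook, and the toy values are made up for illustration.

```python
def count_above(values, threshold):
    """Count entries strictly above threshold, mirroring .Where(...).Count()."""
    return sum(1 for v in values if v > threshold)


def passes_selection(event, lepton_only=True):
    """Emulate the event selection of the query (hypothetical helper).

    lepton_only=True mirrors the adapted query (exactly one lepton with pT > 25);
    lepton_only=False additionally requires >= 4 jets with pT > 25 and
    >= 1 jet with b-tag value > 0.5, as in the original query.
    """
    n_leptons = (count_above(event["electron_pt"], 25)
                 + count_above(event["muon_pt"], 25))
    if n_leptons != 1:
        return False
    if lepton_only:
        return True
    return (count_above(event["jet_pt"], 25) >= 4
            and count_above(event["jet_btag"], 0.5) >= 1)


# Made-up toy events, one dict per event, one list entry per object
toy_events = [
    # one electron, four jets above threshold, one b-tagged jet
    {"electron_pt": [30.0], "muon_pt": [],
     "jet_pt": [40.0, 35.0, 30.0, 26.0], "jet_btag": [0.9, 0.1, 0.2, 0.3]},
    # two leptons above threshold: rejected by both variants
    {"electron_pt": [30.0], "muon_pt": [28.0],
     "jet_pt": [40.0, 35.0, 30.0, 26.0], "jet_btag": [0.9, 0.1, 0.2, 0.3]},
    # one muon, but only one jet above threshold and no b-tag
    {"electron_pt": [], "muon_pt": [45.0],
     "jet_pt": [50.0, 20.0], "jet_btag": [0.1, 0.2]},
]

lepton_only_result = [passes_selection(e) for e in toy_events]
full_result = [passes_selection(e, lepton_only=False) for e in toy_events]
print(lepton_only_result)  # [True, False, True]
print(full_result)         # [True, False, False]
```

If the emulated selection behaves as expected, the problem is more likely in how the query is translated or executed for that particular file than in the selection itself.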
This may have been a transient issue.
Closing this as it has not happened since; presumably it is no longer a problem.