A new Issue was created by @swagata87 Swagata Mukherjee.
@Dr15Jones, @perrotta, @dpiparo, @rappoccio, @makortel, @smuzaffar, @qliphy can you please review it and eventually sign/assign? Thanks.
cms-bot commands are listed here
Thanks to @shdutta16 and @Prasant1993 for bringing this to my notice; this came up while training new photon IDs for Run 3.
FYI @cms-sw/pf-l2
@swagata87 do you see NaNs in the persisted data, or do they appear as part of running some algorithms (e.g. in or upstream of PFCandidatePrimaryVertexSorter)? Having a specific PF candidate with a nan would be more direct to trace.
assign reconstruction
New categories assigned: reconstruction
@jpata,@clacaputo,@mandrenguyen you have been requested to review this Pull request/Issue and eventually sign? Thanks
Looping over particleFlow candidates (vector<reco::PFCandidate> "particleFlow" "" "RECO") in those problematic events, I see that the pt, eta, phi of one PF candidate is -nan in those events. @slava77 so I guess it's a general problem, not specific to the PFCandidatePrimaryVertexSorter algo.
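For reference, a minimal sketch of the kind of scan described here, assuming access to the particleFlow collection inside a CMSSW job (the helper name and surrounding module are hypothetical):

// Sketch only: loop over the particleFlow collection and report candidates
// with non-finite kinematics. Assumes it is called from code that has already
// retrieved std::vector<reco::PFCandidate> from the event.
#include <cmath>
#include <iostream>
#include <vector>
#include "DataFormats/ParticleFlowCandidate/interface/PFCandidate.h"

void scanForNanCandidates(const std::vector<reco::PFCandidate>& pfCands) {
  for (size_t i = 0; i < pfCands.size(); ++i) {
    const auto& c = pfCands[i];
    // std::isfinite is false for both nan and inf
    if (!std::isfinite(c.pt()) || !std::isfinite(c.eta()) || !std::isfinite(c.phi())) {
      std::cout << "PFCandidate " << i << " pdgId " << c.pdgId()
                << " has non-finite kinematics: pt=" << c.pt()
                << " eta=" << c.eta() << " phi=" << c.phi() << std::endl;
    }
  }
}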
considering that it was a pion, is there a nan in the generalTracks collection?
Yes, the -nan PF candidates in both events were pions. I do not see any nan in the general track pt/eta/phi in those events.
The issue seems to be in the PF charged hadron. It's also reflected in the fact that, while the photon's charged hadron isolation is -nan, the track isolation values make sense:
In one event, both photons have:
trkSumPtSolidCone=0 trkSumPtHollowCone=0
In the other event:
photon1: trkSumPtSolidCone=17.5619 trkSumPtHollowCone=17.5619
photon2: trkSumPtSolidCone=38.5868 trkSumPtHollowCone=38.5868
assign pf
New categories assigned: pf
@kdlong,@juska you have been requested to review this Pull request/Issue and eventually sign? Thanks
As an intermediate solution, what we can do is not consider such problematic PF candidates in the isolation sum of photons. This way we won't have nan isolation for photons anymore. This does not address the root cause of why such PF candidates are reconstructed in the first place, but it's better than nothing, and serves the purpose for egamma.
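A minimal sketch of what such a guard could look like in the isolation sum (a hypothetical helper for illustration, not the actual code of PR #39120):

// Sketch only: skip PF candidates with non-finite pt when accumulating the
// charged-hadron isolation sum, so a single bad candidate cannot turn the
// whole isolation into nan. Names and structure are illustrative.
#include <cmath>
#include <vector>
#include "DataFormats/ParticleFlowCandidate/interface/PFCandidate.h"

double chargedHadronIsoSum(const std::vector<const reco::PFCandidate*>& candsInCone) {
  double sum = 0.;
  for (const auto* cand : candsInCone) {
    if (!std::isfinite(cand->pt()))
      continue;  // ignore problematic candidates instead of propagating nan
    sum += cand->pt();
  }
  return sum;
}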
Right, so now I checked that the issue also happens (though rarely) in data (checked in 2022C). I don't quite like the photon's charged isolation being -nan, so I prepared a patch to fix the photon's isolation: https://github.com/cms-sw/cmssw/pull/39120
Okay, after a few more checks, the real issue seems to be here: https://github.com/cms-sw/cmssw/blob/6d2f66057131baacc2fcbdd203588c41c885b42c/RecoParticleFlow/PFProducer/src/PFCandConnector.cc#L159-L161
Before doing that division, we should check that the divisor is > 0.
with this patch, the issue seems to be solved:
diff --git a/RecoParticleFlow/PFProducer/src/PFCandConnector.cc b/RecoParticleFlow/PFProducer/src/PFCandConnector.cc
index 6d3d1fa9ec8..bd96361516e 100644
--- a/RecoParticleFlow/PFProducer/src/PFCandConnector.cc
+++ b/RecoParticleFlow/PFProducer/src/PFCandConnector.cc
@@ -156,9 +156,10 @@ void PFCandConnector::analyseNuclearWPrim(PFCandidateCollection& pfCand,
const math::XYZTLorentzVectorD& momentumPrim = primaryCand.p4();
- math::XYZTLorentzVectorD momentumSec;
+ math::XYZTLorentzVectorD momentumSec(0,0,0,0);
- momentumSec = momentumPrim / momentumPrim.E() * (primaryCand.ecalEnergy() + primaryCand.hcalEnergy());
+ if ( (momentumPrim.E() * (primaryCand.ecalEnergy() + primaryCand.hcalEnergy())) > 0.0 )
+ momentumSec = momentumPrim / momentumPrim.E() * (primaryCand.ecalEnergy() + primaryCand.hcalEnergy());
map<double, math::XYZTLorentzVectorD> candidatesWithTrackExcess;
map<double, math::XYZTLorentzVectorD> candidatesWithoutCalo;
I did not check the effect of the above patch on other objects (jet/met/tau etc)
@swagata87 wouldn't it be enough to check that momentumPrim.E() > 0?
Thanks a lot @swagata87. I don't really understand how the energy can be zero here, is the p4 also fully zero? Can you share the setup you're using for testing (just rerunning reco on the RAW event you shared above)?
wouldn't it be enough to check that momentumPrim.E() > 0?
Hello @perrotta, you are right; the divisor is just momentumPrim.E(), so checking that it is > 0 is sufficient to solve the issue.
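The simplified guard would then look roughly like this (a sketch of the reduced check discussed above, not necessarily the exact code that was merged):

// Sketch only: simplified guard in PFCandConnector::analyseNuclearWPrim,
// protecting the division by momentumPrim.E() (same context as the diff above).
math::XYZTLorentzVectorD momentumSec(0, 0, 0, 0);
if (momentumPrim.E() > 0.0)
  momentumSec = momentumPrim / momentumPrim.E() * (primaryCand.ecalEnergy() + primaryCand.hcalEnergy());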
I don't really understand how the energy can be zero here, is the p4 also fully zero?
Yeah, no idea why momentumPrim.E() is zero; actually px, py, pz are all zero in the 2 events where I am checking these things.
Can you share the setup you're using for testing (just rerunning reco on the RAW event you shared above)?
yes I am running the RECO step on RAW. Below is the setup I have:
1) Run3 DATA
CMSSW_12_4_6
Config file obtained with this cmsDriver command:
cmsDriver.py --filein /store/data/Run2022C/EGamma/RAW/v1/000/355/872/00000/3b478527-d206-4b8e-8004-08e9aff7758b.root --fileout file:AOD.root --data --eventcontent AOD --runUnscheduled --customise Configuration/DataProcessing/Utils.addMonitoring --datatier AOD --conditions 124X_dataRun3_Prompt_v4 --step RAW2DIGI,RECO --geometry DB:Extended --era Run3 --python_filename aod_cfg.py --beamspot Run3RoundOptics25ns13TeVLowSigmaZ --no_exec -n -1
Added eventsToProcess = cms.untracked.VEventRange('355872:546279379-355872:546279379') to just run on the problematic event.
2) RelVal MC
CMSSW_12_1_0_pre4
Config file obtained with this cmsDriver command:
cmsDriver.py --filein 'file:/eos/cms/store/relval/CMSSW_12_1_0_pre4/RelValTTbar_14TeV/GEN-SIM-DIGI-RAW/121X_mcRun3_2021_realistic_v10_HighStat-v2/10000/22c2fc3b-069f-4437-ab6d-edf0a9c0dfc7.root' --fileout file:AOD.root --mc --eventcontent AODSIM --runUnscheduled --customise Configuration/DataProcessing/Utils.addMonitoring --datatier AODSIM --conditions 121X_mcRun3_2021_realistic_v10 --step RAW2DIGI,RECO --geometry DB:Extended --era Run3 --python_filename aod_cfg.py --beamspot Run3RoundOptics25ns13TeVLowSigmaZ --no_exec -n -1
Added eventsToProcess = cms.untracked.VEventRange('1:36727-1:36727') to just run on the problematic event.
Even though #39120 will mitigate it for photons, I could imagine there's a high likelihood that a NaN PFCandidate will also mess up other things.
Therefore, @cms-sw/pf-l2 @laurenhay it might be useful to consider a fix in PF on a short timescale (or understanding if the issue comes from somewhere upstream in reco).
Hi @jpata, yes I am taking a look. We seem to have candidates with 0's for kinematics before the corrections, which turns them into nans. Trying to understand and fix this early in the chain; will have news later this week.
Hi all, sorry, it took me a bit longer to reproduce the error and figure out a suitable debug setup to chase down what's happening, but I think I have some useful info now.
The nan comes from the line identified by Swagata, in a step where the nuclear interactions are recombined into a single candidate. The original issue, however, is coming from a candidate with zero four momentum.
Looking at event 355872:546279379 from the file /store/data/Run2022C/EGamma/RAW/v1/000/355/872/00000/92d09274-5b85-48a0-9ed9-78c0b33ba560.root (it's mixed up above). The track associated to this candidate is displaced (hence the nuclear interaction recovery). It has valid kinematics, but a high qoverpError wrt qoverp, which causes its momentum to be rescaled for consistency with the total calo energy; this is where the zero comes in. The trackRef key is 418, it has track.qoverpError() = 0.03167 and track.qoverp() = -0.02654. I checked whether this is fixed by the mkFit pixelless revert in CMSSW_12_4_7, but it isn't, as the track isn't from a pixelless iteration: track.algoName() = "highPtTripletStep".
I believe what's happening in these lines is a rescaling of the kinematics of all charged hadron tracks simultaneously, such that the sum of the charged hadron tracks is consistent with the sum of the calo energy. Because the track error of this track is so high, it gets a negative value from the fit and is reset to zero here: https://github.com/cms-sw/cmssw/blob/master/RecoParticleFlow/PFProducer/src/PFAlgo.cc#L2510-L2511
In general I don't think zero is a reasonable fallback, as it gives a "ghost" candidate with no kinematics or mass. I guess it would be better to remove it at that point, or at the very least make sure the mass is taken into account to avoid problems downstream. At the moment I'm not sure how technically difficult it would be to remove a candidate at this point in the code; certainly one has to be careful. On the other hand, it might make sense to exclude these tracks from getting promoted to PF cands in the first place, but I need to think/investigate more for that.
I guess this issue is happening more often now if mkFit is giving a higher fraction of displaced tracks with high uncertainty. Is it known or expected? Maybe the mkFit experts want to provide some feedback on this particular track?
What is the direction of this track, and also the number of hits? mkFit tracks are on average shorter than previously with CKF; this can affect the resolution. This may also be a very forward 3-hit (actual 3-layer) track, at which point it probably doesn't matter whether it's mkFit or CKF.
Hi Slava, here is some more info on the track
In [9]: print(track.pt(), track.eta(), track.phi(), track.p())
5.456 -2.620 -1.916 37.67
In [12]: track.numberOfValidHits()
Out[12]: 5
In [13]: track.numberOfLostHits()
Out[13]: 0
Also, in your original post: "The track is displaced". How large is it in dxy(bs)? It's a bit odd to have a large displacement in the pixel-based iteration; the seeding region is limited to 0.02 cm.
One simple thing I don't understand, though, is why we let the rescaleMomentum() function here rescale the full four-vector. Don't we really want to rescale the 3-momentum instead? If we leave these ghost candidates we should at least leave them with a coherent mass.
https://github.com/cms-sw/cmssw/blob/5a3dd48730c2b88d67d1304788801e26e605cd36/RecoParticleFlow/PFTracking/src/PFTrackAlgoTools.cc#L259
is supposed to reject the track with dpt/pt >~ 10 * 0.06^2 ~ 0.036 ... or is this track from elsewhere (NuclearInteractionTrackImporter?)
In [12]: track.numberOfValidHits() Out[12]: 5
considering eta 2.6 and 5 hits, it would not be the shortness of the mkFit candidates. There is perhaps another feature: mkFit merges hits from overlapping seeds, assuming that the hit outlier rejection in the final fit would take care of it. In part this approach helps to increase the efficiency at high eta (compared to CKF). This may have an extra source of tails though.
is supposed to reject the track with dpt/pt >~ 10 * 0.06^2 ~ 0.036 ... or is this track from elsewhere (NuclearInteractionTrackImporter?)
You're right, it's coming in only at the NuclearInteractionTrackImporter, which is taking tracks from PFDisplacedVertexProducer. I have already been playing with the track quality cuts here with @laurenhay for another issue; we will consider this as well. I haven't quite tracked down what's going on, but when the track is accessed here, it refers to refitted kinematics. I'm not sure what the interplay of CKF (which I guess would be used for a refit) and mkFit is then.
Looking into things in more detail, I see that turning PFCands into ghosts is used in other places in PF (e.g. the post-cleaning), so probably the step that turns this into a 0-momentum candidate is fine. I still tend to think that it would be more correct to scale the 3-momentum instead of the four-momentum. I implemented a patch (https://github.com/kdlong/cmssw/commit/b84a8843e1aa00b4041533804967c92e70287734) to the PFCandidate rescaleMomentum function to do this, and it fixes the divide-by-zero problem here without introducing any technical issues that I see. It also seems to me that rescaleMomentum is hardly used anywhere else, so I think this is safe. Do others have an opinion on this, though? Maybe @bendavid @hatakeyamak?
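For illustration, here is a standalone sketch of the idea (not the actual patch): rescale only the spatial momentum components and recompute the energy from the preserved mass, so that a scale factor of zero yields a candidate at rest rather than an all-zero four-vector.

// Sketch only: rescale the 3-momentum by a factor 's' while keeping the mass
// fixed, instead of scaling the full four-vector. Plain doubles are used so
// the example is self-contained; in CMSSW this would act on the candidate's
// math::XYZTLorentzVectorD.
#include <algorithm>
#include <cmath>
#include <cstdio>

struct P4 {
  double px, py, pz, e;
};

P4 rescaleThreeMomentum(const P4& p, double s) {
  const double p2 = p.px * p.px + p.py * p.py + p.pz * p.pz;
  const double m2 = std::max(0.0, p.e * p.e - p2);  // preserve the mass
  return P4{s * p.px, s * p.py, s * p.pz, std::sqrt(m2 + s * s * p2)};
}

int main() {
  P4 pion{1.0, 2.0, 3.0, std::sqrt(14.0 + 0.1396 * 0.1396)};  // ~charged pion mass
  P4 ghost = rescaleThreeMomentum(pion, 0.0);
  // With s = 0 the energy collapses to the mass, not to zero,
  // so later divisions by E() stay well defined.
  std::printf("E = %.4f (mass preserved)\n", ghost.e);
  return 0;
}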
I'm not sure what the interplay of CKF (which I guess would be used for a refit) and mikfit is then.
there should be none: the final track fits are done with CKF in full tracking and in refits. mkFit is used only for the hit pattern recognition now.
But won't it depend on the hits that mkfit has decided are part of the track?
Perhaps I misunderstood the initial concern.
My response implying no interplay was referring to using tracks directly vs refitted tracks, since IIUC, the refitter does not change the hit content.
Before digging more into CKF vs mkFit, is it obvious that the issue is not e.g. in misalignment or some other sources of mis/displaced hits in this data?
Perhaps a check with --era Run3_noMkFit would help, although this is a bit random and may need more than just one event reprocessed.
Before digging more into CKF vs mkFit, is it obvious that the issue is not e.g. in misalignment or some other sources of mis/displaced hits in this data?
I see it happens also in MC (see description: https://github.com/cms-sw/cmssw/issues/39110#issue-1344317299)
root://xrootd-cms.infn.it//store/mc/Run3Winter22DR/GJet_Pt-10to40_DoubleEMEnriched_TuneCP5_13p6TeV_pythia8/AODSIM/FlatPU0to70_122X_mcRun3_2021_realistic_v9-v2/2430000/6cd37543-62ec-4f62-9fa9-23b7c66f9c20.root
where the concern of misplaced hits is somewhat attenuated. If we had raw data it would be relatively trivial to reprocess with ideal conditions to check if that's the source of the issue (but as far as I understand we don't have it)
Hi @slava77: I don't really mean to suggest this is an mkFit "problem." Clearly the situation that leads to a nan is a PF issue and should be fixed at the PF level. But there has been no protection for this condition since forever, and I assume people would have noticed nan PF candidates, so the logical assumption seems to be that the switch to mkFit is the source of the difference. I actually think the current treatment in PF to effectively zero out this cand in comparison to the calo energy is working well enough, we just need a protection for the divide by zero. But it would be interesting to understand why this is seemingly more common now.
@mmusich thanks, you're right, it's happening in MC. What do you mean about not having RAW data? The file I'm looking at is a RAW file.
Or do you mean we don't have the RAW data tier for MC?
What do you mean about not having RAW data? The file I'm looking at is a RAW file.
I mean that I understood from a question asked by @swagata87 in this Mattermost channel that we don't have the parent PDs for this dataset (including some data tier that we can re-reconstruct).
But there has been no protection for this condition since forever, and I assume people would have noticed nan PF candidates, so the logical assumption seems to be that the switch to mkfit is the source of the difference.
is the calorimeter response or calibration possibly something that can lead to a change in behavior (or did that not change since forever?)
The calo calibrations changed, but this is a pretty small effect, and the response has been hardcoded forever. The zeroing happens based on a sum of calo energy in a cluster compared to the sum of the track momentum in the cluster. The only thing that seems "special" to me is that the track momentum uncertainty is very high, so its best-fit value for rescaling is zero. Some output (coming from here):
Compare Calo Energy to total charged momentum sum p = 69.5542 +- 45.4575 sum ecal = 0 sum hcal = 19.504 => Calo Energy = 17.828
case 1: COMPATIBLE |Calo Energy- total charged momentum| = 51.7256 < 1.4988 x 46.5869
max DP/P = 1.19337 > 0.1: take weighted averages
track associated to hcal 0 P = 37.6768 +- 44.9625
track associated to hcal 1 P = 21.2718 +- 6.66727
track associated to hcal 2 P = 4.27024 +- 0.553705
track associated to hcal 3 P = 1.11607 +- 0.0169243
track associated to hcal 4 P = 1.32559 +- 0.0322203
track associated to hcal 5 P = 1.08917 +- 0.0104476
track associated to hcal 6 P = 0.381433 +- 0.00804561
track associated to hcal 7 P = 1.27163 +- 0.0220997
track associated to hcal 8 P = 0.449754 +- 0.00472474
track associated to hcal 9 P = 0.366177 +- 0.00426354
track associated to hcal 10 P = 0.33557 +- 0.00407122
old p 37.6768 new p -10.5045 rescale 0
old p 21.2718 new p 20.2124 rescale 0.950195
old p 4.27024 new p 4.26293 rescale 0.998289
What do you mean about not having RAW data? The file I'm looking at is a RAW file.
I mean that I understood from a question asked from @swagata87 in this mattermost channel that we don't have the parent PDs for this dataset (including some data-tier that we can re-reconstruct)
Yes, the parent dataset of the GJet 122X MC is not available.
There is a RelVal MC where the issue was seen; for that we have the RAW.
Not sure if it is useful, but this is the RAW file: /eos/cms/store/relval/CMSSW_12_1_0_pre4/RelValTTbar_14TeV/GEN-SIM-DIGI-RAW/121X_mcRun3_2021_realistic_v10_HighStat-v2/10000/22c2fc3b-069f-4437-ab6d-edf0a9c0dfc7.root
(https://github.com/cms-sw/cmssw/issues/39110#issuecomment-1223779784)
There is a relVal MC where the issue was seen, for that we have the RAW, not sure if it is useful, but this is the raw: /eos/cms/store/relval/CMSSW_12_1_0_pre4/RelValTTbar_14TeV/GEN-SIM-DIGI-RAW/121X_mcRun3_2021_realistic_v10_HighStat-v2/10000/22c2fc3b-069f-4437-ab6d-edf0a9c0dfc7.root
Thanks @swagata87, it is useful.
Using the setup at https://github.com/cms-sw/cmssw/issues/39110#issuecomment-1223779784 but changing the global tag to 121X_mcRun3_2021_design_v9 (which includes ideal alignment / calibrations), the warning
%MSG-w PrimaryVertexSorting: PFCandidatePrimaryVertexSorter:primaryVertexAssociationJME 05-Sep-2022 12:18:59 CEST Run: 1 Event: 36727
Scaling is NAN ignoring this candidate/track
%MSG
is gone. So it seems it's a feature of the calibrations used in the realistic GT. Adding @cms-sw/trk-dpg-l2 to the thread.
can someone of you prepare a list of the DetIds of the hits associated to the problematic track in the MC event?
can someone of you prepare a list of the DetIds of the hits associated to the problematic track in the MC event?
@mmusich was this for me or for the dpg conveners? I spent some time messing with it. The trackRef key = 3 and ID = 1463 in the AOD that I produce from the RAW, but tracks.recHits() hasn't been stored. Can you give me the keep statement to store all the tracks so this works?
was this for me or for the dpg conveners?
to anyone who can devote some time to this :)
I spent some time messing with it, The trackRef key = 3 and ID = 1463 in the AOD that I produce from the RAW
thanks, this is already helping.
Can you give me the keep statement to store all the tracks so this works?
I think that if you retain the RECO data tier instead of AOD, you will have all the info needed.
Thanks, good point. I'm still not 100% sure about the interface: if I do track.recHit(i) I get a SiPixelRecHit which always returns null from its det function. I can give you the collection, keys, and raw IDs from these objects, though:
for i in range(track.recHitsSize()):
    rh = track.recHit(i)
    print(rh.id().id(), rh.key(), rh.get().rawId())
gives
454 40 303063048
454 41 344241156
454 42 344242180
454 43 344503300
454 44 344504324
454 45 344765444
454 46 344766468
Thanks, good point. I'm still not 100% sure about the interface, if I do track.recHit(i) I get a SiPixelRecHit which always returns null from its det function.
can you try to print details from the reco job itself?
@slava77 not immediately sure where I should add the printouts, probably someone from the tracking POG would be able to do this quicker than me.
Addressed from the PF side in #39368
type pf
I noticed that one of the algorithms (that I was running) was quitting with the error that the charged hadron isolation had a NaN value, while running it over Run 3 GJet samples. This is actually how I noticed this issue. When I had run the same algorithm on Run 2 GJet samples earlier, this did not crop up. So, most likely Run 2 samples did not have this issue.
I had a further look at this:
454 40 303063048 454 41 344241156 454 42 344242180 454 43 344503300 454 44 344504324 454 45 344765444 454 46 344766468
these are all Pixel DetIds: namely, it's a track with 1 hit in BPix layer 1 and 6 hits on side 1 of FPix (suggesting it's very forward).
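A self-contained sketch of how these raw IDs decode into detector/subdetector, assuming the standard DetId bit layout (4 detector bits at offset 28, 3 subdetector bits at offset 25; Tracker = 1, PixelBarrel = 1, PixelEndcap = 2):

// Sketch only: decode the raw DetIds quoted above using the DetId bit layout.
// In CMSSW one would use DetId(rawId).det() and DetId(rawId).subdetId() instead.
#include <cstdint>
#include <cstdio>

int main() {
  const uint32_t rawIds[] = {303063048, 344241156, 344242180, 344503300,
                             344504324, 344765444, 344766468};
  for (uint32_t id : rawIds) {
    unsigned det = (id >> 28) & 0xF;     // 1 = Tracker
    unsigned subdet = (id >> 25) & 0x7;  // 1 = PixelBarrel (BPix), 2 = PixelEndcap (FPix)
    std::printf("%u: det=%u subdet=%s\n", (unsigned)id, det,
                subdet == 1 ? "BPix" : (subdet == 2 ? "FPix" : "other"));
  }
  return 0;
}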
is it obvious that the issue is not e.g. in misalignment or some other sources of mis/displaced hits in this data?
Comparing the rigid-body alignment parameters (the 3 cartesian coordinates of the sensor active-area center and the corresponding Euler angles) of these DetIds between the "realistic" misaligned geometry and the "ideal" one, I don't see exceedingly worrisome deviations, albeit there is one FPix DetId with a somewhat high global-z displacement (around 60 μm), considering that the RMS of the whole set is about 20 μm.
On the other hand I have seen that using this overridden set of conditions on top of the recipe at https://github.com/cms-sw/cmssw/issues/39110#issuecomment-1236821688
from Configuration.AlCa.GlobalTag import GlobalTag
process.GlobalTag = GlobalTag(process.GlobalTag, '121X_mcRun3_2021_realistic_v10', '')
process.GlobalTag.toGet = cms.VPSet(
cms.PSet(record = cms.string("TrackerSurfaceDeformationRcd"),
tag = cms.string("TrackerSurfaceDeformations_zero")
)
)
is sufficient to remove the warning "Scaling is NAN ignoring this candidate/track".
So it looks like it is actually caused by the non-rigid body degrees of freedom.
+1
@cms-sw/reconstruction-l2 : Is there something pending to close this issue?
This is to keep track of (and hopefully solve at some point) an issue where reco::Candidate pt is -nan in some rare cases. It seems that this problem can lead to an egamma object's PF isolation being -nan as well, if the problematic candidate ends up in the egamma object's isolation cone.
One such problematic event is in this AOD file:
root://xrootd-cms.infn.it//store/mc/Run3Winter22DR/GJet_Pt-10to40_DoubleEMEnriched_TuneCP5_13p6TeV_pythia8/AODSIM/FlatPU0to70_122X_mcRun3_2021_realistic_v9-v2/2430000/6cd37543-62ec-4f62-9fa9-23b7c66f9c20.root
The exact run:event number is: eventsToProcess = cms.untracked.VEventRange('1:78326956-1:78326956'),
If we run the AOD->MiniAOD step (I ran in CMSSW_12_2_1) for this event, we get the warning shown above ("Scaling is NAN ignoring this candidate/track") for the event, triggered from here: https://github.com/cms-sw/cmssw/blob/6d2f66057131baacc2fcbdd203588c41c885b42c/CommonTools/RecoAlgos/src/PrimaryVertexSorting.cc#L35-L36
I checked that, in that loop, c->pt() = -nan, and the pdgId is -211. In this event, we have 2 reconstructed gedPhotons, and both have chargedHadronIso() = -nan.
Another example of such a problematic event is in this GEN-SIM-DIGI-RAW file:
/eos/cms/store/relval/CMSSW_12_1_0_pre4/RelValTTbar_14TeV/GEN-SIM-DIGI-RAW/121X_mcRun3_2021_realistic_v10_HighStat-v2/10000/22c2fc3b-069f-4437-ab6d-edf0a9c0dfc7.root
The exact event is: eventsToProcess = cms.untracked.VEventRange('1:36727-1:36727')
Running the RECO step (in CMSSW_12_1_0_pre4) on this event, we get the same warning. In this case, reco::Candidate pt = -nan and its pdgId is 211, and the event has 2 reconstructed ged photons with a -nan value of charged hadron isolation.
Next I plan to check whether the issue of the photon's charged hadron isolation being -nan happens in data as well.