GRIFFINCollaboration / GRSISort

A lean, mean, sorting machine.
MIT License
22 stars 54 forks source link

Where's SPICE gone #532

Closed jsmallcombe closed 8 years ago

jsmallcombe commented 8 years ago

Trying to sort SPICE data and it's producing corrupt analysis trees

GRSI [0] new TBrowser Created a GRootCanvas. GCanvasInit called. (class TBrowser_)0x17594d0 GRSI [1] Error in TFile::ReadBuffer: error reading all requested bytes from file analysis35661_000.root, got 240 of 300 Error in TFile::Init: analysis35661000.root failed to read the file type data. (class TFile)0x1648cd0

Also during the file construction SPICE and S3 arent listed, where they were before.

Any thoughts? I have done a hard reset to the master.

Attempting to make analysis trees. Closing after Sort. Found 1 trees with 20263829 total fragments. Sorting tree[1]: fragment35651_000.root starting TGRSIRunInfo Status: RunNumber: 35651 SubRunNumber: 000 RunStart: 1440820354 RunStop: 1440821194 RunLength: 840 TIGRESS: true SHARC: false TRIFOIL: false TIP: false CSM: false GRIFFIN: false SCEPTAR: false PACES: false DESCANT: false

Build Window   = 200
Moving Window  = TRUE
AddBack Window = 15.0
Array Position = 110

==============================

parsed 10944 lines. Successfully read 912 TChannels from File AnalysisTreeBuilder: read in 912 TChannels. Tree Index not found, building index on TriggerId/FragmentId... done! MEM_SIZE = 17179869184 created output file: analysis35651_000.root created AnalysisTree terminate called after throwing an instance of 'std::out_of_range' what(): vector::_M_range_check

r3dunlop commented 8 years ago

Likely just not being written to the tree. Can you point me in the direction of a Midas file so that I can try? Thanks for testing this stuff out by the way I'm expecting quite a few bugs on the tigress side at this point.

On Oct 21, 2015, at 2:54 AM, jsmallcombe notifications@github.com wrote:

Trying to sort SPICE data and it's producing corrupt analysis trees

GRSI [0] new TBrowser Created a GRootCanvas. GCanvasInit called. (class TBrowser)0x17594d0 GRSI [1] Error in TFile::ReadBuffer: error reading all requested bytes from file analysis35661_000.root, got 240 of 300 Error in TFile::Init: analysis35661_000.root failed to read the file type data. (class TFile)0x1648cd0

Also during the file construction SPICE and S3 arent listed, where they were before.

Any thoughts? I have done a hard reset to the master.

Attempting to make analysis trees. Closing after Sort. Found 1 trees with 20263829 total fragments. Sorting tree[1]: fragment35651_000.root starting TGRSIRunInfo Status: RunNumber: 35651 SubRunNumber: 000 RunStart: 1440820354 RunStop: 1440821194 RunLength: 840 TIGRESS: true SHARC: false TRIFOIL: false TIP: false CSM: false GRIFFIN: false SCEPTAR: false PACES: false DESCANT: false

Build Window = 200 Moving Window = TRUE AddBack Window = 15.0 Array Position = 110

parsed 10944 lines. Successfully read 912 TChannels from File AnalysisTreeBuilder: read in 912 TChannels. Tree Index not found, building index on TriggerId/FragmentId... done! MEM_SIZE = 17179869184 created output file: analysis35651_000.root created AnalysisTree terminate called after throwing an instance of 'std::out_of_range' what(): vector::_M_range_check

— Reply to this email directly or view it on GitHub.

r3dunlop commented 8 years ago

I have fixed the RunInfo, it just wasn't being checked in that list. I'm not sure which vector is going out of range, but it might be because I blanketed the non-lean things into TDetector rather than TGRSIHit. I will continue looking at it.

r3dunlop commented 8 years ago

As a heads up I have changed the "GetHit" functions that you guys were using to "GetSiLiHit" as we will eventually move these over to TGRSIDetectorHits and the GetHit returns a base TGRSIDetectorHit with not all of the same functionality of the SiLiHit. I have also added error handling on the hit vectors so maybe that will give some more detail on where it's failing.

r3dunlop commented 8 years ago

This could also be an issue with Tigress. It was switched to a clones array and I am unsure of how much that was tested.

pcbend commented 8 years ago

It was tested as a clones array. It should be OK. If I can find some free time I will try to push a more comprehensive tigress this week. On Oct 21, 2015 9:48 AM, "Ryan Dunlop" notifications@github.com wrote:

This could also be an issue with Tigress. It was switched to a clones array and I am unsure of how much that was tested.

— Reply to this email directly or view it on GitHub https://github.com/GRIFFINCollaboration/GRSISort/issues/532#issuecomment-149901613 .

jsmallcombe commented 8 years ago

Data is in midtig02_data/tiguser/SPICETestAnalysis/SPICETestAugust2015 That was for run35661_000.mid

I cant compile the current, same issue as Henderson in the next thread.

We are trying to do development of analysis for both Tigress+Tip and Tigress+Spice at this end, obviously we don't want to work on incredibly out of date version, so do we have an ETA on when we can work on these detector classes without them being re-written every 5 minutes?

r3dunlop commented 8 years ago

The detector classes are, for the most part, not going to change anymore. The leangrsi side of things was evolving quickly because we did an overhaul. The detector classes shouldn’t have been touched on the master branch for the last 4 or 5 months.

Things should not bother you at least to the point that you can’t develop. As long as you write code that interfaces into the class through functions and you don’t touch the actual data members, we shouldn’t have a problem with you developing anything. If you are talking about doing actual analysis sorting, it should probably be exterior from those classes anyway which requires interfacing, and shouldn’t affect your code. There could be changes in the very near future with Tigress/sharc/tip/Sili/S3 as we make these leaner, but those changes won’t affect code, they will only affect a written root file so it might require a quick fragment->analysis rebuild which are pretty quick with the lean version. What level of development are you talking about? Can you give me a list of detector systems you are working on and the priority in which they should be fixed?

On Oct 21, 2015, at 2:21 PM, jsmallcombe notifications@github.com wrote:

Data is in midtig02_data/tiguser/SPICETestAnalysis/Sdata/SPICETestAugust2015 That was for run35661_000.mid

I can compile the currently, same issue as Henderson in the next thread.

We are trying to do development of analysis for both Tigress+Tip and Tigress+Spice at this end, obviously we don't want to work on incredibly out of date version, so do we have an ETA on when we can work on these detector classes without them being re-written every 5 minutes?

— Reply to this email directly or view it on GitHub https://github.com/GRIFFINCollaboration/GRSISort/issues/532#issuecomment-149984011.

r3dunlop commented 8 years ago

How is this problem looking now? Still there?

jsmallcombe commented 8 years ago

"For the most part" is not overly reassuring. 4 months? You changed TSiLi today!

We don't just want to interface with the code we want to bloody edit it, these are our detectors and we want to be able to add and change data members, which we aren't going to do while you mess about with underlying classes.

TTigress, TTip, TSiLi, TS3 and TRF at the moment. We are looking at adding our own wave function fitting algorithms at the Fragment->Detector stage.

As for the sorting problem: Spice and S3 are at least in the list now, but the vector error

terminate called after throwing an instance of 'std::out_of_range' what(): vector::_M_range_check

Remains and the output files are still unreadable.

pcbend commented 8 years ago

Adding functionality to fit waveforms is completely in line with the way things are set up. In fact, this is already done for the RF where the phase is extracted by fitting a waveform stored in the fragment. Similar, the TTriFoil class does a very primitive wave form analysis to set the "tbeam' variables. This is nothing new. I have talked with Aaron about adding this to tip as well but as there is little evidence that SFU will switch there analysis I have never felt the pressure to put it in.

You absolutely do not want to change data members on the fly. It will break the analysis for everyone else. The code as it stands is provides a frame work to get to the data. You are more than welcome to clone it and change it as you wish, however you should fork it and do it on your own branch. The farther you drift from what people are trying to provide, the more on your own you will be. I highly encourage you to contribute by submitting ideas and/or code updates to the main repo. To be fair, Ryan did not change the TSiLi class today, he simply brought it back after a complicated merge. He is volunteering his time to take care of this repo which we gave to him because we thought (still think) that he is doing an excellent job. It wasn't so long ago that such a frame work did not exists for anyone at TRIUMF, believe me this problems are minuet compared to the issues just a few years back.

Now, the error you post is because something is trying to call a hit which does not exists. If you tell me a file and computer I can take a closer look.

On Wed, Oct 21, 2015 at 6:16 PM, jsmallcombe notifications@github.com wrote:

"For the most part" is not overly reassuring. 4 months? You changed TSiLi today!

We don't just want to interface with the code we want to bloody edit it, these are our detectors and we want to be able to add and change data members, which we aren't going to do while you bugger about with underlying classes.

TTigress, TTip, TSiLi, TS3 and TRF at the moment. We are looking at adding our own wave function fitting algorithms at the Fragment->Detector stage.

As for the sorting problem: Spice and S3 are at least in the list now, but the vector error

terminate called after throwing an instance of 'std::out_of_range' what(): vector::_M_range_check

Remains and the output files are still unreadable.

— Reply to this email directly or view it on GitHub https://github.com/GRIFFINCollaboration/GRSISort/issues/532#issuecomment-150040340 .

jsmallcombe commented 8 years ago

I'm a fairly bad coder but I'm pretty sure I can add a data member to TSiLiHit without the world exploding. If you are so worried about people breaking the code when making changes then more documentation on the code flow and style is required. Tell us how to add things that won't break the code, dont tell us not to add things to the code.

There are enough of us in the tigress group that want the SFU code implemented that we are going to do it. We dont need your permission, nor you to do it for us. We just need a stable framework and instructions on what not to break.

Data is midtig02 data/tiguser/SPICETestAnalysis/SPICETestAugust2015/run35661_000.mid I am currently working on smilodon

pcbend commented 8 years ago

No one told you not to add things. What I said was to fork it and add what you want. When you fork it you have your own personal copy where you can added what you want, where you want, when you want. The issue to adding things to the code as I see it, is it can physically change on going analysis for people. I find this to be very bad thing.

Yes, I think the sfu stuff should be in. However you should really check what has already been done. It's all there made freely available to you and everyone else. Fortunately for all those doing analysis with the code, you do need someone with ownership and know how before anything will make it into the main repo. This is not to prevent you from adding things or praticipating but to keep the broken to a minimum. Again, on your own fork you can go nuts and add every future or class you can come up with. If you think it is good when it is done, push it upstream and let others look at it. That is how collaborative coding works.

On Wed, Oct 21, 2015 at 7:01 PM, jsmallcombe notifications@github.com wrote:

I'm a fairly bad coder but I'm pretty sure I can add a data member to TSiLiHit without the world exploding. If you are so worried about people breaking the code when making changes then more documentation on the code flow and style is required. Tell us how to add things that won't break the code, dont tell us not to add things to the code.

There are enough of us in the tigress group that want the SFU code implemented that we are going to do it. We dont need your permission, nor you to do it for us. We just need a stable framework and instructions on what not to break.

Data is midtig02 data/tiguser/SPICETestAnalysis/SPICETestAugust2015/run35661_000.mid I am currently working on smilodon

— Reply to this email directly or view it on GitHub https://github.com/GRIFFINCollaboration/GRSISort/issues/532#issuecomment-150048328 .

r3dunlop commented 8 years ago

The point is that there are base data member that exist for a detector system. Charge, timestamp, etc. You can set those with Setters, you can get those with getter functions. If you are adding variables that are not base variables, you need to add them as transient members (much like how we do with energy). That way nothing breaks in the root files, and things run much more quickly. That is what I mean by interfacing, your function shouldn’t have to access the non transient data members.

So your “I'm a fairly bad coder but I'm pretty sure I can add a data member to TSiLiHit without the world exploding. If you are so worried about people breaking the code when making changes then more documentation on the code flow and style is required. Tell us how to add things that won't break the code, dont tell us not to add things to the code.” The answer is, you must be extremely careful how you add non-transient data members (in any program!) because it changes I/O.

I tagged the old version as well, and I said the master would not have bugs on the tigress side that would need testing and ironing out. If this is causing problems, please feel free to branch off the old working tag. I brought this back in line because git work had stalled on the master anyway and once the bugs are ironed out this code will be much better.

On Oct 21, 2015, at 7:01 PM, jsmallcombe notifications@github.com wrote:

I'm a fairly bad coder but I'm pretty sure I can add a data member to TSiLiHit without the world exploding. If you are so worried about people breaking the code when making changes then more documentation on the code flow and style is required. Tell us how to add things that won't break the code, dont tell us not to add things to the code.

There are enough of us in the tigress group that want the SFU code implemented that we are going to do it. We dont need your permission, nor you to do it for us. We just need a stable framework and instructions on what not to break.

Data is midtig02 data/tiguser/SPICETestAnalysis/SPICETestAugust2015/run35661_000.mid I am currently working on smilodon

— Reply to this email directly or view it on GitHub https://github.com/GRIFFINCollaboration/GRSISort/issues/532#issuecomment-150048328.

jsmallcombe commented 8 years ago

I have a fork. I'm not a total idiot. Cheers.

Despite the pain involved we do actually want to be able to be collaborative and merge our changes so wandering off on our own when significant changes are being made to the master isn't a great plan.

r3dunlop commented 8 years ago

Here’s the deal: We had not touched the master for a long time. When we bring in a new version of the code there will be bugs. People need to test to find those bugs. If you don’t like that it’s broken, go back to an old commit, it is very simple to pick your commit. Bugs always occur when big changes happen. You can wait until they are ironed out before moving up to the newest commit. If you want to develop you can, and your changes can be merged, you don’t always have to be on the newest commit to do so.

As long as you don’t directly touch the existing non-transient (transient ends in //! in the header) data members (there shouldn’t be a reason to as they are the absolute basic, should never change, straight out of the data stream variables), then it doesn’t matter if you code on an old version and then bring it into the new version. That is the point of writing encapsulated code. Otherwise, we would always be waiting on someone to finish what they want. If you want to directly add data members, go for it, but you will definitely be slowing the ROOT IO down https://github.com/GRIFFINCollaboration/GRSISort/issues/312 https://github.com/GRIFFINCollaboration/GRSISort/issues/312. I put in a lot of time figuring this issue out, and it does indeed make a very large change in the amount of resources consumed.

On Oct 21, 2015, at 7:25 PM, jsmallcombe notifications@github.com wrote:

I have a damn fork. I'm not a total fuckwit. Cheers.

Despite the pain involved we do actually want to be able to be collaborative and merge our changes so wandering off on our own when significant changes are being made to the master isn't a great fucking plan.

— Reply to this email directly or view it on GitHub https://github.com/GRIFFINCollaboration/GRSISort/issues/532#issuecomment-150052835.

r3dunlop commented 8 years ago

If you check out the 2.3.23 version, you will be at where the code was last Thursday. That code had not been updated for a month. The current version of master was leangrsi merged with that master version, which was sure to cause bugs because they had been diverged for 6 months. If you had not been told the fact that the master has moved beyond 2.3.23, and 2.3.23 was instead still called master, I'm not sure we would even be having this discussion. This move was to get Tigress in a state where it worked much much better than it does now. At some level I feel like you are attacking us for trying to enhance code that you do not have to use at this point, and we can let you know when it is bug free.

Checking out the old version and developing there is really not a huge issue as we have already made all of merge conflicting decisions. If you want to be working on the most up-to-date code, you always have to merge something, and if we encapsulate the code properly we shouldn't have to worry about updating different aspects of the same class at the same time. This is because if I need the energy for example, using GetEnergy should return the energy, and I really shouldn't care how it became the energy. Using fEnergy, on the other hand, may not work because of the details on how fEnergy actually becomes energy. We have done this quite often on other classes in the code (TPPG, TScaler for example) This is what I meant by use "interfacing functions" and not the direct members.

To be honest, there isn't a lot of documentation on the innards of the code because we are low on the number of people actively developing said innards. Most of this is to get code out there for people to be able to do their own analyses (myself included). I have spent some time writing documentation to be able use the code and analysis classes, but unfortunately, we just do not have the man power to also document how the code behaves in the background.

I'll be in Vancouver starting Thursday through Sunday. If you want to sit down with Vinzenz and I we can plan how to attack this with the most efficiency. I can lay out the way the detector classes work etc. We will be in the Griffin counting room. Until then, if you take a look at the way Griffin, Sceptar and Descant are coded, it should be able to give you a good idea for how we want Tigress et al to look (waveform processing included). It should give you an idea of what I mean by minimally touching the non-transient members, and not including more members than necessary.

r3dunlop commented 8 years ago

I just changed something to do with the RF (even though run info says it’s not there) and it ran all the way through. I’m not sure if this is an intermittent problem or not. Is there supposed to be an RF in these data?

On Oct 21, 2015, at 7:36 PM, Ryan Dunlop rdunlop@uoguelph.ca wrote:

Here’s the deal: We had not touched the master for a long time. When we bring in a new version of the code there will be bugs. People need to test to find those bugs. If you don’t like that it’s broken, go back to an old commit, it is very simple to pick your commit. Bugs always occur when big changes happen. You can wait until they are ironed out before moving up to the newest commit. If you want to develop you can, and your changes can be merged, you don’t always have to be on the newest commit to do so.

As long as you don’t directly touch the existing non-transient (transient ends in //! in the header) data members (there shouldn’t be a reason to as they are the absolute basic, should never change, straight out of the data stream variables), then it doesn’t matter if you code on an old version and then bring it into the new version. That is the point of writing encapsulated code. Otherwise, we would always be waiting on someone to finish what they want. If you want to directly add data members, go for it, but you will definitely be slowing the ROOT IO down https://github.com/GRIFFINCollaboration/GRSISort/issues/312 https://github.com/GRIFFINCollaboration/GRSISort/issues/312. I put in a lot of time figuring this issue out, and it does indeed make a very large change in the amount of resources consumed.

On Oct 21, 2015, at 7:25 PM, jsmallcombe <notifications@github.com mailto:notifications@github.com> wrote:

I have a damn fork. I'm not a total fuckwit. Cheers.

Despite the pain involved we do actually want to be able to be collaborative and merge our changes so wandering off on our own when significant changes are being made to the master isn't a great fucking plan.

— Reply to this email directly or view it on GitHub https://github.com/GRIFFINCollaboration/GRSISort/issues/532#issuecomment-150052835.

jhenderson88 commented 8 years ago

There should be RF in all of the data from the SPICE experiment I believe.

r3dunlop commented 8 years ago

So I definitely got this problem the first time I tried to run the code on this file in Guelph. However, every time since then I have not received the error. This makes me think that it is an initialization problem in one of the Detector classes. RF tells me it doesn't exits, so I'll look into that problem as well.

jhenderson88 commented 8 years ago

I am working with some RF data from the recent TIP experiment and the RF seems to be there. I will check what I see for the SPICE data now.

r3dunlop commented 8 years ago

It’s actually frustrating that it failed first try and I can’t get it to fail anymore.

On Oct 22, 2015, at 1:26 PM, jhenderson88 notifications@github.com wrote:

I am working with some RF data from the recent TIP experiment and the RF seems to be there. I will check what I see for the SPICE data now.

— Reply to this email directly or view it on GitHub https://github.com/GRIFFINCollaboration/GRSISort/issues/532#issuecomment-150297254.

jhenderson88 commented 8 years ago

This is the vector out of range error? I fixed that by re-adding the GetTime() function to TTigress (probably should've clarified that this was why it was necessary, woops).

r3dunlop commented 8 years ago

Haha ok. well I'm glad to hear it’s working now. We’ll try to get the detector classes nice, clean and stable in the next week.

On Oct 22, 2015, at 1:28 PM, jhenderson88 notifications@github.com wrote:

This is the vector out of range error? I fixed that by re-adding the GetTime() function to TTigress (probably should've clarified that this was why it was necessary, woops).

— Reply to this email directly or view it on GitHub https://github.com/GRIFFINCollaboration/GRSISort/issues/532#issuecomment-150297756.