puhep / pudb

Purdue CMS FPix Database

MoreWeb upload not working #160

Closed rbartek closed 8 years ago

rbartek commented 8 years ago

MoReWeb tar files are not appearing in the database for M-K-2-05, M-J-4-18, M-K-1-18, and M-K-1-24 (Friday), or for M-K-1-09, M-I-4-30, and M-J-1-20 (Thursday).

Tar files for M-L-2-05 were tested and work with bare MoReWeb.

gneeser commented 8 years ago

The test files for M-J-1-20, M-I-4-30, and M-K-1-09 are incomplete: they're missing the configfiles and logfiles folders. In my experience, this usually happens when the test is exited early.

In any case, this caused MoReWeb to throw an exception and stop running.
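A quick pre-upload check for this kind of incomplete tarball could look like the sketch below. It is only an illustration: the two folder names come from the description above, while the helper name `missing_folders` and the idea of matching on any path component are assumptions, not part of MoReWeb or elComandante.

```python
import tarfile

# Folder names taken from the comment above; the helper itself is hypothetical.
REQUIRED_FOLDERS = ("configfiles", "logfiles")


def missing_folders(tar_path, required=REQUIRED_FOLDERS):
    """Return the required folder names that appear nowhere in the tarball."""
    with tarfile.open(tar_path) as tar:
        # Collect every path component of every member, so a folder counts
        # as present whether it is stored as its own entry or only implied
        # by the files inside it.
        components = set()
        for name in tar.getnames():
            components.update(part for part in name.split("/") if part)
    return [folder for folder in required if folder not in components]
```

An empty list means both folders were found. Anything else suggests the tarball was made from a run that did not shut down cleanly.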

I've moved the files and MoReWeb seems to be processing the rest (it's currently processing M-K-1-24).

The modules that have incomplete files will likely have to be retested so that the proper configuration can be generated by elComandante. Of course, if you know how to do it manually that would be a way too, but I for sure don't know how to do that.

Let me know if the modules don't appear within the next hour or so.

-Greg

jstupak commented 8 years ago

Did you make these tarballs by hand, or were they created by elComandante?

lantone commented 8 years ago

ah yeah, john i think you cracked the case. we did make tars by hand last thursday because elcomandante didn't exit correctly. i think the log files must only get dumped when elcomandante finishes so they're missing from our homemade tar files. bummer. i looked around on the UNL machine, but couldn't find the log files anywhere. so yeah, the conclusion remains, those modules will need to be retested. lesson learned: if elcomandante doesn't quit nicely, gotta rerun!

jstupak commented 8 years ago

yea, the log files are made at the very end of elComandante.py, when you tell elComandante to gracefully shut down. I don't think there is an easy way to produce them in the event you kill elComandante.

@gneeser Do I understand correctly that because a bad tarball was uploaded, new (good) tarballs were not being processed?

gneeser commented 8 years ago

It looks that way, yes. In the log, there was an exception raised complaining that the directory didn't exist, and nothing after that. I was somewhat surprised because in the past, I had thought it simply skipped those directories.
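The skip-and-continue behavior described here can be sketched roughly as follows. This is not the MoReWeb code or its patch, just a generic illustration of catching a failure per tarball so one bad upload does not block the rest; `process_all` and the `process_one` callback are hypothetical names.

```python
import logging

logger = logging.getLogger("upload_queue")


def process_all(tarballs, process_one):
    """Run process_one on each tarball; return (succeeded, failed) lists."""
    succeeded, failed = [], []
    for tarball in tarballs:
        try:
            process_one(tarball)
        except Exception:
            # Record the traceback for this tarball, then carry on with
            # the next one instead of stopping the whole run.
            logger.exception("Skipping %s", tarball)
            failed.append(tarball)
        else:
            succeeded.append(tarball)
    return succeeded, failed
```

Returning the failed list makes it easy to report which modules need retesting rather than silently dropping them.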

jstupak commented 8 years ago

@lantone @gfunk723

Is it possible to protect against this in case it happens again?

lantone commented 8 years ago

indeed it is! it was a bug, looks like it's been patched in the master branch of the psi46 repo. I just copied it to our fnal/MoReWeb version and pushed it (https://github.com/fnal/MoReWeb/commit/8e87ed0519ccc5ad3605426c0a3c90f82710f129)

@gneeser, is it easy to pull that into your local repo?

jstupak commented 8 years ago

Cool, thanks!

drberry85 commented 8 years ago

Can we close this since MoReWeb is up and running?

gneeser commented 8 years ago

Yes, I think we can. I probably left it open until we could do the update and forgot to close it once we did that update a while back.

-Greg