Closed rismoney closed 1 year ago
I ran it on 6.6.0-13 and I am seeing the same issue. Upgraded to TF 1.x too, and same thing. It happens on about 1 in 4 executions.
Hi @rismoney ,
Will look into this and get back to you.
When the server profile was created, what were the hardware name and enclosure group name? Can you share the tfstate file?
It seems you're dynamically providing the values for hardware and enclosure group name. Can you check that it gives the same values every time you create or plan the resource?
The module always provides a single static name, despite the variable use. The variable values never change between runs, and it always uses the default, and isn't specified. It's there more as a convention of easy copy/paste.
My folder directories are divided by enclosure groups. So folder EG06 never has any references to any servers in EG05. The state file for this directory never has any mentions of EG05. Yet somehow, when the provider tries to compare the resource to oneview, it returns bad data. As you see above, it returned an EG05 node and is trying to make it EG06. "EG_05" -> "EG_06" is an impossibility in my tf codebase.
But it only happens randomly. So if I run terraform plan, it works. Run it again without changing anything, it works. Again, doesn't. Again doesn't. Then it does. Then it does. Nothing changes between these tf runs. It could also return the bad data on the first run, so it's not related to a cached module directory or similar leftover artifact. It can happen from a clean clone.
I can see about sharing state somehow with you. Perhaps I can email it or allow you access to a repo. Let me inspect it.
Is it possible there is a bug in the api call that matches a server profile to a hardware name where it is not getting the right answer? I ran tf in trace mode and didn't see any mentions of the wrong EG. So something is happening that I can't pinpoint based on the run.
Since it is not reproducible every time, we are first trying to get an environment ready to reproduce it. Meanwhile we will wait for the state file. Regarding your question: yes, we search for the resource by name, and we are checking whether there is any issue with that method.
I have given you both access to my repo.
Sorry, I missed the invitation and now it has expired. Can you please send it again?
were you able to get in?
Yes, we are able to access it now. Will check and get back to you.
When we do a read operation, we first do a GET call using the server profile name. If we get the wrong server profile while retrieving it, then this scenario can happen. We have tried name searches multiple times, but they always returned the correct server profile. Looking at the state file, we can see that the ID field is correct, so we are not able to debug it at present. Will keep looking and may ask for more details.
sure. if you need me to compile or build some test code, around outputting the underlying GET calls, that might help isolate it to the underlying go oneview library?
I am not a golang programmer, but perhaps that might eliminate this provider's direct usage of that, and indicate a problem in the getter? Perhaps there is an error condition or parsing problem being swallowed and then it's returning improper data.
Will get back to you on this soon.
I am planning a synergy composer1 to composer2 upgrade, and a subsequent oneview upgrade to 7.x, so will continue to track if this is fixed somehow in any of that effort.
Closing this, since we did not hear from you on whether it is still an issue. Please open another defect if there is any issue.
Would you be able to provide some go code I can compile to perform the get call using the server profile name? I'd like to try and repro this outside of terraform, and perform like 100 queries against oneview. Something is still randomly retrieving the wrong server profile. In powershell I have run 100s of Get-OVServerProfile -name "profilename" and can't repro... I am not sure what else to do. I am not sure how else to instrument this to get the wrong server profile.
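A minimal sketch of such a reproduction loop, offered as a starting point rather than as the provider's actual code. It assumes the OneView REST endpoint `GET /rest/server-profiles` with a `filter` query of the form `name matches '<name>'`, the `Auth` and `X-Api-Version` headers, and the `members`, `name`, `uri` and `enclosureGroupUri` response fields; whether the provider's lookup issues exactly this request is not confirmed in this thread. The names `profile`, `lookup`, `restLookup` and `countMismatches` are hypothetical, and obtaining the session ID (login) is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
)

// profile holds only the fields needed for the comparison.
// Field names are assumed from the OneView REST API.
type profile struct {
	Name              string `json:"name"`
	URI               string `json:"uri"`
	EnclosureGroupURI string `json:"enclosureGroupUri"`
}

// profileList mirrors the assumed collection response shape.
type profileList struct {
	Members []profile `json:"members"`
}

// lookup is any function that returns the profiles matching a name.
type lookup func(name string) ([]profile, error)

// restLookup builds a lookup that queries OneView by profile name.
// baseURL, sessionID and apiVersion are supplied by the caller.
func restLookup(client *http.Client, baseURL, sessionID, apiVersion string) lookup {
	return func(name string) ([]profile, error) {
		q := url.Values{}
		q.Set("filter", fmt.Sprintf("name matches '%s'", name)) // assumed filter syntax
		req, err := http.NewRequest(http.MethodGet, baseURL+"/rest/server-profiles?"+q.Encode(), nil)
		if err != nil {
			return nil, err
		}
		req.Header.Set("Auth", sessionID)
		req.Header.Set("X-Api-Version", apiVersion)
		resp, err := client.Do(req)
		if err != nil {
			return nil, err
		}
		defer resp.Body.Close()
		if resp.StatusCode != http.StatusOK {
			return nil, fmt.Errorf("unexpected status %s", resp.Status)
		}
		var list profileList
		if err := json.NewDecoder(resp.Body).Decode(&list); err != nil {
			return nil, err
		}
		return list.Members, nil
	}
}

// countMismatches runs the lookup n times and counts every run that does
// not return exactly one profile with the requested name and the expected
// enclosure group URI. Errors are returned rather than swallowed.
func countMismatches(get lookup, name, wantEG string, n int) (int, error) {
	mismatches := 0
	for i := 0; i < n; i++ {
		members, err := get(name)
		if err != nil {
			return mismatches, fmt.Errorf("run %d: %w", i, err)
		}
		ok := len(members) == 1 &&
			members[0].Name == name &&
			members[0].EnclosureGroupURI == wantEG
		if !ok {
			mismatches++
			fmt.Printf("run %d: unexpected result: %+v\n", i, members)
		}
	}
	return mismatches, nil
}
```

To use it, call `countMismatches(restLookup(client, baseURL, sessionID, apiVersion), "profilename", expectedEnclosureGroupURI, 100)` from a `main` function and compare the returned count against zero. Keeping the lookup behind a function type means the same loop can later be pointed at the Go OneView library's own getter to see whether the discrepancy appears there.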
This is still happening randomly on TF plans and applies, using TF 0.13.6 with hewlettpackard/oneview v6.3.1-13. On a rerun of the plan/tf, this doesn't happen. I am at a loss as to how it is not reliably getting values. Somehow it is looking up the wrong hardware name; I am not sure if there is a guid lookup that is failing or something else that is not right. This directory in particular has no Enclosure Group 5 references at all.
This does not happen every time. It is maybe 25% of TF runs.
relevant sections in sp_netmapping: