Closed lonniev closed 9 years ago
Hey @lonniev,
Could you please send us the complete Vagrantfile and the output of the Vagrant command run in debug mode? Because this output is rather lengthy, please use a service like GitHub Gist or Pastebin, or send the logs as an email attachment.
@sneal have you seen this with Windows before?
https://gist.github.com/lonniev/8ab5dacf10cb877cf717
That's just the Vagrantfile. The VAGRANT_LOG debug is vast. Anything you want to focus on? Naively, I observe it is doing a lot of the right expected stuff and then, at the end, without a lot of helpful context, it makes the same complaint about the empty run list and unwinds the uncaught exception. I can capture the whole log of course but if you want me to grep out something, I could save an electronic forest with a little filtering first.
@lonniev can you just gist the entire debug log please? It's expected to be really long :smile:
It hurts to send it, but here it is https://gist.github.com/lonniev/92a6395cd13b5a4ad32c/raw/gistfile1.sh-session
And here's the chef-solo-2 roles folder on the guest https://www.dropbox.com/s/0mllufhetu8tnwc/Screenshot%202014-12-14%2022.18.47.png?dl=0
name "tasktop-sync-server"
description "Role for a Server running Tasktop Sync Studio"
run_list(
  "recipe[tasktop-sync-studio]"
)
I'll debug more tomorrow.
I've been seeing this as well. As I'm fairly new to vagrant and chef, I thought it was something I was doing.
config.vm.define "win" do |win|
  win.vm.provision "chef_solo" do |chef|
    # Want to use smb for windows, but there's a bug in 1.6.5
    # that stops it working. So virtualbox for now
    chef.synced_folder_type = :virtualbox
    chef.cookbooks_path = "./chef-library"
    chef.roles_path = "./chef-data/roles"
    chef.data_bags_path = "./chef-data/data_bags"
    chef.add_role "dev"
  end
end
{
  "name": "dev",
  "description": "Role for development. Edit as required.",
  "json_class": "Chef::Role",
  "default_attributes": {
  },
  "override_attributes": {
    "chef_client": { }
  },
  "chef_type": "role",
  "run_list": [
    "notepadplusplus"
  ],
  "env_run_lists": {
  }
}
I have an identical config for a linux machine, and that works as expected.
Hunch - what if you specify the full path to the roles (instead of relative)?
I'll try, but currently I'm blocked by #4976.
@sethvargo Tried your suggestion. Made no difference.
edit:
Also tried chef.synced_folder_type = :smb
but that didn't help either.
Some more info in this gist, where I ran the guest scripts directly.
The path seems to exist. How exactly does vagrant tell chef-solo where to look for the roles, i.e. where the paths are?
Actually, looking at your gist, I see that you already deduced everything I say below. Your command-line call is exactly what the elevated PowerShell script calls when vagrant commands the guest to run chef-solo.
@MartinSGill chef-solo, for a vagrant use case, seems to be configured with a solo.rb and a dna.json. The former provides configuration settings such as the paths to cookbooks, data bags, environments, and roles, while the latter (in my instances) provides the run list of recipes and roles.
These two files are found on a Windows guest at c:\tmp\vagrant-chef-2.
The contents of these files come from statements in the chef provisioner block within the Vagrantfile.
Beyond this reply, I defer to the Vagrant and Opscode folk who know more and can correct my flawed assumptions.
@sethvargo when stating the location of the paths within the chef provisioner block of the Vagrantfile, my assumption is that the pathnames should reference (relatively or absolutely) the directories as they are located on the Host OS. Then, it is vagrant and the chosen provisioner that take care of constructing mount points within the guest OS and mapping the external Host directories as stated to the internal Guest mount points (or physically importing and moving files to achieve the same copied effect).
Given that, how could it matter if the statements in the Vagrantfile used relative versus absolute paths if we can see that the vagrant has mapped, mounted, or moved all the necessary files to the absolute locations specified in the guest's solo.rb configuration file?
I updated my solo.rb to use proper windows paths:
if Chef::VERSION.to_f < 11.8
  role_path "c:/tmp/vagrant-chef-2/chef-solo-2/roles"
else
  role_path ["c:/tmp/vagrant-chef-2/chef-solo-2/roles"]
end
Basically I add c: to the beginning and it now works correctly. Does this mean vagrant is generating the solo.rb incorrectly?
If solo.rb is parsed by all the same file I/O libraries that enable Ruby File to do the right magic on Windows, then I would assume that the chef-solo would translate the unixy paths in solo.rb to DOS ones correctly.
However, given your test, I now am uncertain.
Also, why would chef-solo be able to resolve the very similar cookbook path but not the role path?
That suggests that the solo.rb file is following the correct convention (leave all paths in Ruby-standard form) and it is code within chef-solo for the Windows platform that is not doing the transformation to DOS paths that the code also within chef-solo for the Windows platform does do for the cookbook path.
That then is a clue viable enough to lead one to open up the chef-solo code and simply compare the two bits of code.
It looks like in September code was added to all the file retrievers to do Dir.glob lookups for suitable files. Given this, it's likely that the lookups are now the same for each kind of file (cookbook, role, etc.).
What may be possible is that the run_list.expand method might be returning an error other than "file not found" and the code that raises the Chef::Exceptions::MissingRole exception only assumes that the reason expand would fail is because the file(s) could not be found.
I'm taking a look to see if expand might fail (on Windows, for role files) for other reasons than just a missing file. (For example, it could be that a too-long filename string or "-" characters in the filenames might cause a pattern matcher to return a false negative.)
@lamont-granquist is the current "owner" of RunListExpansion; he may be a wise mentor here.
Being chatty here:
# Expand a run list from disk. Suitable for chef-solo
class RunListExpansionFromDisk < RunListExpansion
  def fetch_role(name, included_by)
    Chef::Role.from_disk(name)
  rescue Chef::Exceptions::RoleNotFound
    role_not_found(name, included_by)
  end
end
and
# Load a role from disk - prefers to load the JSON, but will happily load
# the raw rb files as well. Can search within directories in the role_path.
def self.from_disk(name)
  paths = Array(Chef::Config[:role_path])
  paths.each do |path|
    roles_files = Dir.glob(File.join(Chef::Util::PathHelper.escape_glob(path), "**", "**"))
What this suggests is that @MartinSGill has suggested an immediate hackaround for us: simply add both the local relative unixy path and the mapped absolute DOS path to the chef.roles_path array argument. Ideally, the hardcoded DOS path should be unnecessary, but the second iteration of the above paths.each block would then find the sought file as it visited the hardcoded path with proper DOS path format.
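To make the "second iteration finds it" idea concrete, here is a minimal sketch of a multi-path role lookup. It is not Chef's actual implementation: `find_role_file` and `demo_role_lookup` are hypothetical names, and the sketch only assumes that each configured path is globbed in turn and that the first hit wins.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper (not Chef's code): return the first file named
# "<name>.json" or "<name>.rb" found under any of the given role paths.
def find_role_file(name, role_paths)
  Array(role_paths).each do |path|
    # Dir.glob treats a backslash as an escape character, so the sketch
    # normalizes separators to forward slashes before globbing.
    base = path.tr("\\", "/")
    candidates = Dir.glob(File.join(base, "**", "*"))
    match = candidates.find do |file|
      [".json", ".rb"].any? { |ext| File.basename(file) == "#{name}#{ext}" }
    end
    return match if match
  end
  nil
end

# Demo: the first path does not exist, the second one holds the role file.
def demo_role_lookup
  Dir.mktmpdir do |dir|
    roles = File.join(dir, "roles")
    FileUtils.mkdir_p(roles)
    File.write(File.join(roles, "dev.json"), '{"name": "dev"}')

    missing = File.join(dir, "no-such-dir")
    found = find_role_file("dev", [missing, roles])
    found && File.basename(found)
  end
end
```

A path that cannot be resolved simply yields no candidates, so the lookup falls through to the next entry. That is why adding a second, well-formed path would be enough, provided it actually reaches solo.rb.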
Where does the DOS drive letter get prefixed to the paths?
Also, the chef-solo file retrievers could validate that the configured paths are free of cruft characters and within the DOS length limit--using the check methods offered in this little path_helper class.
For now, I'll use the hackaround and give the owners of these files a chance to remark on what pilot errors I may be suffering or on how best to improve the identified source files.
Dang, vagrant is too smart: the array of paths in the Vagrantfile must be resolvable local paths and it drops from that array any paths which aren't parseable or existing on the Host OS. So, the hack addition isn't passed along in the solo.rb file.
Perhaps the Vagrantfile and the Chef code are both ok and the issue is related to the fact that the collection of chef directories under "tmp" is a collection of Windows symlinks to various \vboxsrv-located directories.
Can the Ruby File methods properly traverse those links or do they run into I/O errors as they try to open or stat the components of the paths?
I don't think the changes to RunListExpansion should have affected this. It looks more like you've got issues in Chef::Config dealing with how the role_path is expanded via Chef::Util::PathHelper stuff, and I'm not the expert there. The "escape_glob()" function is necessary on windows when using Dir.glob and needs to not include the actual glob characters, so that function looks correct. It matches the same usage of that function that we have elsewhere in the provider code.
There also seems to be some confusion in this issue around the use of role_path vs. roles_path as the configuration option, and around passing an array vs. a string into the config. The correct argument for chef is role_path, not roles_path. In Chef 12 you should also be able to simply set chef_repo_path to the parent directory and have everything resolved to subdirs of that. The intent is that both arrays and strings will work for all of them. Sometimes there's other external gems which parse Chef::Config that haven't been updated for that usage, although the latest Berkshelf/ridley has been updated for that. I know knife-spork also made assumptions about cookbook_path being either a string or array, which was recently fixed as well.
@lamont-granquist I think you are right about the source of the issue being within either Chef::Util::PathHelper or back with roles/role_path attribute.
Note I was using chef.roles_path in my Vagrantfile. When I switch that to chef.role_path, vagrant then complains:
Lonnies-MacBook-Pro:virtualbox-tasktopsync lonniev$ vagrant provision vb-tt-sync
There are errors in the configuration of this machine. Please fix
the following errors and try again:
chef solo provisioner:
* The following settings shouldn't exist: role_path
* The following settings shouldn't exist: role_path
So, I am forced to use roles_path, and it does set role_path in the solo.rb.
@sethvargo seems to be confirming this unfortunate mapping of the attribute name roles_path to role_path.
It is just a smoky day in the Chef kitchens. More burned cr*p:
Lonnies-MacBook-Pro:virtualbox-tasktopsync lonniev$ vagrant plugin update
Updating installed plugins...
Updated 'chef' to version '12.0.1'!
Updated 'vagrant-aws' to version '0.6.0'!
Updated 'vagrant-azure' to version '1.0.5'!
Updated 'vagrant-share' to version '1.1.4'!
Lonnies-MacBook-Pro:virtualbox-tasktopsync lonniev$ vagrant provision vb-tt-sync
==> vb-tt-sync: Installing Chef cookbooks with Librarian-Chef...
/Users/lonniev/.vagrant.d/gems/gems/vagrant-azure-1.0.5/lib/vagrant-azure/communication/powershell.rb:24:in `ready?': undefined method `check_winrm' for #<VagrantPlugins::ProviderVirtualBox::Driver::Meta:0x000001041f59c8> (NoMethodError)
from /Users/lonniev/.vagrant.d/gems/gems/vagrant-omnibus-1.4.1/lib/vagrant-omnibus/action/install_chef.rb:40:in `call'
@sneal ^. Halp.
Welp vagrant-azure, i'm outta here... good luck!! lol...
Paging @adamex and @randomcamel and @jdmundrawala who might have more useful knowledge this deep into chef+windows. I'm way over my head at this point.
All I can think of is that it might be better to try Berkshelf rather than Librarian and see if that works any better. TK will eventually support windows/azure (soon), but doesn't right this moment so that isn't an alternative. I'm in over my head tho.
Np, Lamont. I don't think it's a librarian chef thing because all librarian does is populate the cookbook directories on the host in an automated way. Once chef-solo is off and running, librarian isn't part of the process. It is due, I think, to something with Windows path names and perhaps symlinks.
@lonniev The NoMethodError is from the vagrant-azure plugin communicator trying to call a non-standard driver method on the VBox provider which does not implement the check_winrm method.
This was working just before the vagrant plugin update. Because I am not currently working with an Azure provider, I may be able to work around it by uninstalling the vagrant-azure plugin. Ideally, though, the vagrant-azure team, the vagrant-windows team, and the chef team regression test their updates. Right?
This issue is specific to roles in Chef 12. I took a working provisioned Windows guest and upgraded from Chef 11 to Chef 12 (without changing anything else) and now I get the same error as the original poster.
==> default: Running chef-solo...
==> default: [2014-12-16T11:03:23-08:00] INFO: *** Chef 12.0.1 ***
==> default: [2014-12-16T11:03:23-08:00] INFO: Chef-client pid: 2848
==> default: [2014-12-16T11:03:49-08:00] INFO: Setting the run_list to ["role[dotnetframework]"] from CLI options
==> default: [2014-12-16T11:03:51-08:00] ERROR: Role dotnetframework (included by 'top level') is in the runlist but does not exist. Skipping expand.
==> default:
==> default:
==> default: ================================================================================
==> default: Error expanding the run_list:
==> default: ================================================================================
==> default:
==> default: Missing Role(s) in Run List:
==> default: ----------------------------
==> default: * dotnetframework included by 'top level'
==> default:
==> default: Original Run List
==> default: -----------------
==> default: * role[dotnetframework]
==> default:
==> default: [2014-12-16T11:03:51-08:00] FATAL: Stacktrace dumped to C:/var/chef/cache/chef-stacktrace.out
==> default: [2014-12-16T11:03:51-08:00] FATAL: NoMethodError: undefined method `run_id' for nil:NilClass
Directly running chef-solo on the guest from PowerShell using the same config files generated by Vagrant also produces the same error. However if I edit the generated solo.rb file and add a drive letter to the roles path, everything works correctly.
role_path "/tmp/vagrant-chef-3/chef-solo-2/roles"
-> role_path "c:/tmp/vagrant-chef-3/chef-solo-2/roles"
While yes, Chef 12 did change behavior here, really the issue is that the Vagrant Chef provisioners should use Windows specific/friendly paths when generating the solo.rb file. This should be done for all the paths in solo.rb and not just the roles path. The reason Vagrant doesn't yet generate Windows specific paths is largely a legacy of the old vagrant-windows plugin which tried to minimize changes to Vagrant core.
If you must use roles with the current version of Vagrant and Chef 12, then you might be able to use the Vagrant Chef provisioner custom_config_path option. That would point to a custom solo.rb file which contains the Chef role_path setting using a drive-letter-prefixed path.
@sneal thank you so much for looking into this. I would never have figured that out. Are you sure this shouldn't be reported as a regression in Chef 12 though? Seems like that shouldn't change to me.
Also, should not all paths in vagrant and chef use the Ruby convention and require any code that is serializing paths to strings for OS-specific files to be responsible for decorating those paths for particular OSs? Higher level code should not have to deal with this stuff.
I believe the workaround suggested here is to ask vagrant authors to craft their own solo.rb files with OS-specific cruft when the vagrant is responsible for provisioning Windows guests. Yes? That seems to be bitter medicine.
should not all paths in vagrant and chef use the Ruby convention and require any code that is serializing paths to strings for OS-specifi
Yes, but that is much more difficult than you make it sound :smile:
Ah, but that is the price we choose when we offer to write code to make life easier for others. If that was a viable excuse, none of us would code automation for others. We’d just hack on our own one system in bash for our own little selves.
If it is difficult, write it once, modularize it, and make it easier for everyone else.
Actually, naively, I would expect Ruby::Pathname to have some nice convenience method for us like #to_osformat( os=:unix ) which would handle OS-BS.
@lonniev it's not the "format" - replacing slashes is easy:
if File::ALT_SEPARATOR
  path.gsub(File::SEPARATOR, File::ALT_SEPARATOR)
else
  path
end
The hard part is figuring out the drive letter. It could be c:/, which is the most common, but it could also be any other letter of the alphabet depending on the guest box.
I'm not making an excuse, I'm just asking you to avoid trivializing the issue :smile:. The only native support Ruby has for something like this is File::ALT_SEPARATOR as I showed above, but that does not solve the drive letter issue. We will definitely fix this as it's a real bug (and I'll tag it appropriately), but it is going to take a little bit longer than a simple two or three line change. That's what I was trying to convey by my previous comment, and I apologize if that was unclear.
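For illustration only, here is what the easy half plus a guessed drive letter could look like. `to_windows_path` is a hypothetical helper, not something Vagrant or Chef provides, and the `drive: "c:"` default is exactly the assumption described above as unsafe: the real system drive has to come from the guest.

```ruby
# Hypothetical helper: give a Ruby-style path a drive prefix when it has
# none, optionally switching to backslashes. The default drive is an
# assumption; the guest may use another letter.
def to_windows_path(path, drive: "c:", backslashes: false)
  normalized = path.tr("\\", "/")
  unless normalized.match?(/\A[A-Za-z]:/)
    normalized = "/" + normalized unless normalized.start_with?("/")
    normalized = drive + normalized
  end
  backslashes ? normalized.tr("/", "\\") : normalized
end

role_path = to_windows_path("/tmp/vagrant-chef-2/chef-solo-2/roles")
```

Under that assumption, `role_path` ends up as the same c:/tmp/vagrant-chef-2/chef-solo-2/roles string that was hand-edited into solo.rb earlier in this thread. A path that already carries a drive letter is returned unchanged.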
@sethvargo You're right. I filed a new Chef issue https://github.com/opscode/chef/issues/2666 so at least they're aware of the behavior change.
@sethvargo, no problem, I understand. Yes, I know the drive letter is the hard part (I’m a former unix filesystem developer) because Unix has a nice unified filesystem architecture while Windows doesn’t. If I was writing #to_osFormat(), I might accept an optional hash of OS-specific junk. Perhaps therein I’d allow a drive tag to default to “c:\”. Making that assumption seems to be pervasive in ruby/vagrant/chef land. It might also be deduced from %USER% or %WINNT% or some other probably-safe assumption.
I’m mainly surprised that this is a new issue.
Anyway, thanks guys for attending to it: I’ll await progress and I won’t trivialize what you are doing.
I suspect this is probably a Chef 12 regression where we tightened up what we accepted but weren't considering this use case, and I've marked the associated chef bug as such.
I am able to work around the role_path lack-of-a-drive-letter-prefix issue https://github.com/opscode/chef/issues/2666 with this workaround:
In the Vagrantfile, within the block for the chef provisioner, add:
chef.custom_config_path = "ChefConfig.WindowsMonkeyPatch"
Then, add the new file ChefConfig.WindowsMonkeyPatch and provide the following content:
role_path "c:/tmp/vagrant-chef-2/chef-solo-2/roles"
This will replace the (mis)computed value of the role_path setting with the known actual well-formed absolute Windows path to that same directory.
Note that you may name the file anything legitimate and that the absolute Windows path will depend on the order of other chef.*_path lines in your provisioner block. To determine the right absolute path, examine the runtime VM when the chef.roles_path = "roles" line is specified and then use the actual path discovered in the running Windows VM after the MissingRole exception is thrown.
Ohai! I'm going to close this as we have determined this is a regression in Chef 12. If Chef determines this is "not a bug", we can revisit potential ways to fix this in Vagrant, but this is really an upstream bug.
I am getting the exact same issue on Mac OSX. Any solution?
FYI, I am hitting this issue as well, @lonniev. My setup is Vagrant 1.7.2 on Windows 2012r2 using Microsoft Hyper-V as the provider to create a Windows 2012r2 VM with chef-solo 12.2.1 (apparently I was being silly to think latest is greatest) in that VM.
I was happy with Vagrant 1.7.1 and chef-solo 11.14.2. The only reason I moved is that the "sql_server" cookbook revealed a bug in the older version of chef-solo. :( That was my day.
I am just glad someone knows about this.
I just downgraded to Chef client version 11.18.6 and the role problem went away. :+1:
Hopefully this helps others.
This doesn't look like a bug in Chef to me. Rather, it's a bug in the default value Vagrant assigns to the provisioning_path config property.
By default, this property is set to /tmp/vagrant-chef-#, even on Windows guests.
Shared Chef Options:
provisioning_path (string) - The path on the remote machine where Vagrant will store all necessary files for provisioning such as cookbooks, configurations, etc. This path must be world writable. By default this is /tmp/vagrant-chef-# where "#" is replaced by a unique counter.
/tmp/vagrant-chef sounds pretty Linux-y. I don't get the error if I set that config property to something more Windows-y:
config.vm.provision 'chef_zero' do |chef|
  chef.provisioning_path = 'C:\vagrant-chef'
  chef.roles_path = 'roles'
  chef.add_role 'windowsnode'
end
Maybe something along the lines of updating the default value of provisioning_path in vagrant/plugins/provisioners/chef/config/base_runner.rb if the guest is Windows? I can open a new issue for this.
The path is fine. This works in chef 12.4.0.rc.1
Generally, the issues all arise with dealing with Windows pathname drive letters and backslashes that get interpreted by Ruby regex as escaping codes.
(I need to turn off a Chef Config monkey patch to assure that 12.4.0.rc.1 gets it right but I haven’t had the role issue myself for a week or so with 12.4.0.rc.1 — but the patch may still be doing its workaround. Your conversation here is a good kick to me to go see if I still need it.)
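A small sketch of the backslash problem mentioned above, in plain Ruby with no Chef or Vagrant involved. The helper names are hypothetical. When a Windows path is used directly as a regular expression, sequences such as \t and \r are read as tab and carriage return, so the pattern no longer matches the very path it was built from.

```ruby
# Hypothetical helpers: build a pattern from a path with and without
# escaping, then test the pattern against that same path.
def naive_match?(path)
  Regexp.new(path).match?(path)
end

def escaped_match?(path)
  Regexp.new(Regexp.escape(path)).match?(path)
end

windows_path = 'c:\tmp\roles' # single quotes keep the backslashes literal

naive   = naive_match?(windows_path)   # false: \t and \r became control characters
escaped = escaped_match?(windows_path) # true: Regexp.escape neutralizes the backslashes
```

Paths written with forward slashes sidestep this entirely, which may be one reason the c:/... form works in solo.rb.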
Hmm...I wonder what they're doing in the release candidate. Because even escaped, \tmp\vagrant-chef doesn't get interpreted right by Windows. Really, /tmp isn't a normal Windows directory. Putting a drive letter at the front makes the problem go away, via File.absolute_path or File.expand_path, but the feedback I got when I submitted a pull request in Chef that does that was that there might be unintended consequences if the user changes the running process' current working directory.
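The working-directory concern can be shown with a short sketch. `expanded_from` is a hypothetical helper, and the example runs on any OS. The Windows-specific part is my understanding rather than something demonstrated here: on Windows, expanding a drive-less path such as /tmp/vagrant-chef takes the drive letter from the process's current directory.

```ruby
require "tmpdir"

# Hypothetical helper: expand a path as if the process were running in dir.
def expanded_from(dir, path)
  Dir.chdir(dir) { File.expand_path(path) }
end

Dir.mktmpdir do |first|
  Dir.mktmpdir do |second|
    # The same relative path resolves differently in each working directory.
    a = expanded_from(first, "roles")
    b = expanded_from(second, "roles")
    puts "differs by working directory: #{a != b}"
  end
end
```

If chef-solo were to change directory between reading the config and loading roles, a path expanded at the wrong moment could point somewhere unintended, which seems to be the objection raised against the pull request.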
The vagrant base box I built for SoftLayer does create a c:\tmp directory. I need to check my state of affairs with what I have working with 12.4.0.rc.1. I may just be lucky.
I opened another issue for the Vagrant default paths, just to see what the consensus is:
I was just lucky. If I remove the line:
chef.custom_config_path = "ChefConfig.WindowsMonkeyPatch"
where the following is set on the guest:
role_path "c:/vagrant/roles"
Then chef role-specified provisioning is busted because the roles path can't be resolved.
I apologize for being optimistic; I thought I had removed this bandaid.
Similar to (closed) https://github.com/mitchellh/vagrant/issues/2818
The Vagrantfile includes:
The entire configuration works fine for Ubuntu guest boxes. The provider is virtualbox. All the proper files are in the right directories on the host, all the right mounts of shared folders are in place, all the remapped files are available on the guest Windows, all the files can be opened from the Windows guest file explorer, and the solo.rb file has the proper, mapped paths for cookbooks, roles, and bags. If chef.add_role is replaced with chef.add_recipe for a recipe that is in the cookbooks folder, then that works fine. It is just the role access that is failing.
So, what could be the problem?
(Is there a unix-dos CRLF issue at the root of this?)