Open ssttevee opened 4 years ago
hi, thanks for filing the issue
the 365k output you saw is the manifest which basically contains a fs layout of everything that was put in to venv - following those steps, for me, that's something like: 80 directories, 1799 files
that log full message is a known bug that is resolved in master - you can get around that by using 'ops load -n' - this will utilize the nightly build (builds from master)
let me know if this gets you further down the road
as for building other packages - that's definitely a key possibility - if you think it might be good for us to have an official tensorflow package happy to work with you on one for that as I totally agree the average user shouldn't be having to build the base pkg -- very loose instructions are in:
The `-n` flag seemed to do the trick, at least for `ops load`. My real goal was to get it running on gcp, but `ops image create` also suffers from the same issue and it doesn't seem like it takes `-n`.
I think it would probably be more worthwhile to add a mechanism that sniffs out linked libraries and automatically includes them from the working environment. That would make creating packages and images much more streamlined for everyone, first-party or otherwise.
hrm... that's pretty lame but definitely something we can add support for https://github.com/nanovms/ops/issues/478
as for your other comment we already automatically include linked libraries ala ldd, but for things that aren't explicit yes they need to go into pkgs
Oh, I see that in the code now. From what I can tell, it looks like it only works for the main elf binary.
In this case, for example, a numpy shared object, which requires libffi, is imported at runtime so it isn't included. Perhaps it should be expanded to include all files? Or all files with a certain extension/regexp or in a certain folder? Or at least add a new config option to declare files to sniff?
EDIT: I think that is something that I may be able to do, if pull requests are accepted.
pull requests are definitely accepted, although in this case it sounds like a tensorflow package would be more appropriate - i don't know if a regex would cut it in this case - for what you are proposing you'd really want to dump the ast of the python script in question and from looking at debugging output even then the interpreter itself will look for shared libs in a half-dozen diff. places
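To illustrate what "dumping the AST" of a script could look like, here is a minimal sketch using Python's standard `ast` module. The function name `imported_modules` is hypothetical and this is not something ops does. It only sees static `import` statements, so imports made dynamically and shared libraries the interpreter locates at runtime are still missed, which is the limitation described above.

```python
import ast


def imported_modules(source):
    """Return the sorted top-level package names imported anywhere in the source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a.b.c" -> record "a"
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # "from a.b import c" -> record "a"; relative imports are skipped
            modules.add(node.module.split(".")[0])
    return sorted(modules)
```

Calling `imported_modules(open("main.py").read())` (with `main.py` standing in for the script in question) gives a list of package names that would then still have to be mapped to files on disk.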
we do this a lot for jvm based applications where we take a particular java package && then add in the framework or whatever on top
if you want to take a stab at building a tensorflow pkg lmk otherwise I can try and whip one up - we have limited user created packages currently via `ops load --local` (load local package)
I was thinking more along the lines of sniffing the linked libraries of any included `.so` files.
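A rough sketch of that idea, assuming `ldd` is available on the build host: walk a directory, run `ldd` on each `.so` file, and collect the resolved library paths. The function names `parse_ldd_output` and `collect_shared_deps` are hypothetical, and this is not how ops implements its own dependency lookup.

```python
import re
import subprocess
from pathlib import Path

# Matches resolved ldd lines such as:
#   libffi.so.6 => /usr/lib/x86_64-linux-gnu/libffi.so.6 (0x00007f...)
# Lines without "=>" or with "not found" are ignored.
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(/\S+)")


def parse_ldd_output(text):
    """Return {soname: resolved_path} for every resolved dependency in ldd output."""
    deps = {}
    for line in text.splitlines():
        match = _LDD_LINE.match(line)
        if match:
            deps[match.group(1)] = match.group(2)
    return deps


def collect_shared_deps(root):
    """Run ldd on every file ending in .so under root and merge the results."""
    deps = {}
    for so_file in Path(root).rglob("*.so"):
        result = subprocess.run(
            ["ldd", str(so_file)], capture_output=True, text=True
        )
        if result.returncode == 0:
            deps.update(parse_ldd_output(result.stdout))
    return deps
```

Pointing `collect_shared_deps` at a venv directory would list libraries such as libffi that extension modules need, without having to trace which modules the script imports.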
I'll leave that to you because I don't think I'll be able to use nanos for my current project again. I need fork for parallelizing python but it's not implemented :(
I am unable to package any python program that includes tensorflow :(
Environment:
Minimal reproduction steps:
Contents of config.json:
Resulting error:
Removing the `MapDir` property in config.json gives the expected result:

PS: The ops/nanos experience with python is really lacking. The examples are too trivial and the documentation is poor. I had to jump through many hoops before I figured out that I can just map my venv folder to `/.local`. I don't think I would've gotten this far if I had any less python experience. Also, libraries like libffi are probably worth bundling in the official pkg.