Closed ptillet closed 10 years ago
Looks great. How about an API less like the C++ and more like this? I hope I've grasped the C++ semantics sufficiently. Basically, it should be possible for PyViennaCL to do the scheduler / generator dispatching intelligently, without much modification of current code; remember also, PyViennaCL doesn't execute statements until the user requests it, or the result of the computation is required by some non-PyViennaCL code.
statement = A * B # this doesn't execute anything yet, just represents the expression A * B
statement.profile = p.MatrixProductProfile(...)
statement.execute() # calls generate_enqueue_statement intelligently
If that makes sense, then great. Also, if you have a look again at the statement_wrapper class (line 135), then a viennacl::scheduler::statement object is actually created. I don't expose that object to Python because it's usually only needed once (at dispatch), and its structure might change until then. If we continue with PyViennaCL building C++ objects only when necessary, then we should only need to add the right (mostly hidden) logic for calling generate_enqueue_statement -- right?
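To make the deferred-execution idea concrete, here is a minimal sketch in plain Python. Every name in it (`Node`, `leaf`, `dispatched_with`) is hypothetical and not part of PyViennaCL. The rule "a profile selects the generator, otherwise the scheduler" is only an assumption about how the hidden dispatch logic might look.

```python
# Hypothetical stand-ins for illustration; not PyViennaCL's real classes.
class Node:
    """Records an operation and its operands without computing anything."""

    def __init__(self, op, operands):
        self.op = op
        self.operands = operands
        self.profile = None          # optional tuning profile, set before execute()
        self.dispatched_with = None  # which path ran; stays None until execute()

    def __mul__(self, other):
        return Node("mul", (self, other))

    def __add__(self, other):
        return Node("add", (self, other))

    def _value(self):
        if self.op == "leaf":
            return self.operands[0]
        left, right = (operand._value() for operand in self.operands)
        return left * right if self.op == "mul" else left + right

    def execute(self):
        # Assumed rule: an explicit profile selects the generator path,
        # otherwise the default scheduler path is used.
        self.dispatched_with = "generator" if self.profile is not None else "scheduler"
        return self._value()


def leaf(value):
    """Wrap a plain Python number as an expression leaf."""
    return Node("leaf", (value,))


A, B = leaf(3.0), leaf(4.0)
statement = A * B                    # builds the tree; nothing is computed yet
statement.profile = "some-profile"   # placeholder for a real profile object
result = statement.execute()         # the computation happens here
```

The point of the sketch is that the profile is just an attribute on the pending expression, so user code does not need to change unless it wants to override the default.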
Also, you're right, we probably should have a mailing list!
Also, I suspect that we could populate some of the profile parameters on the basis of PyViennaCL introspecting the objects (so things like 'float' could be automatically determined, perhaps?).
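As a rough illustration of that kind of introspection, the sketch below infers a scalar type name from the operands. It uses the standard library `array` module as a stand-in for PyViennaCL vectors. The function name `infer_scalar_type` and the typecode table are assumptions made for this example only.

```python
from array import array

# Hypothetical mapping from Python array typecodes to C scalar type names.
_SCALAR_NAMES = {"f": "float", "d": "double"}


def infer_scalar_type(operands):
    """Return the single scalar type shared by all operands, or raise ValueError."""
    names = set()
    for operand in operands:
        if operand.typecode not in _SCALAR_NAMES:
            raise ValueError("unsupported typecode: %r" % operand.typecode)
        names.add(_SCALAR_NAMES[operand.typecode])
    if len(names) != 1:
        raise ValueError("operands do not share one scalar type: %s" % sorted(names))
    return names.pop()


x = array("f", [1.0, 2.0])
y = array("f", [3.0, 4.0])
scalar_type = infer_scalar_type([x, y])
```

Mixed-precision operands are rejected here rather than promoted; a real implementation might choose differently.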
Hey Toby,
Yes, it is a nice shortcut, nice hint! I think we could improve it further, since the generator supports multiple statements. Once pyviennacl's statements support assignment, we should be able to do something like this:
statements = [Statement(z = 2*x + y), Statement(y = 3*x + z), Statement(x = 2*z + 5)]
pyviennacl.execute(statements)  # Using the optimal profile as determined by an autotuner
pyviennacl.execute(statements, p.VectorSaxpyProfile(...))  # Overriding the profile
This will generate a custom OpenCL kernel that makes better use of memory bandwidth:
for(unsigned int gid = get_global_id(0); gid < size; gid += get_global_size(0)){
  float vx = x[gid];
  float vy = y[gid];
  float vz = z[gid];
  //perform all the operations in registers
  x[gid] = vx;
  y[gid] = vy;
  z[gid] = vz;
}
Of course, you can assume that there will soon be a proper overload of viennacl::execute(std::vector<viennacl::statement>); for that purpose.
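The bandwidth argument can be sketched in plain Python. The two functions below are illustrative only and use lists in place of device buffers. The unfused version makes one pass over memory per statement, while the fused version loads each element once, applies all three statements to local variables, and stores each result once, as in the kernel above.

```python
def run_unfused(x, y, z):
    """One loop per statement: every statement re-reads and re-writes memory."""
    n = len(x)
    for i in range(n):
        z[i] = 2 * x[i] + y[i]
    for i in range(n):
        y[i] = 3 * x[i] + z[i]
    for i in range(n):
        x[i] = 2 * z[i] + 5


def run_fused(x, y, z):
    """One loop for all statements: one load and one store per element."""
    for i in range(len(x)):
        vx, vy, vz = x[i], y[i], z[i]   # load once
        vz = 2 * vx + vy                # all work happens on local values
        vy = 3 * vx + vz
        vx = 2 * vz + 5
        x[i], y[i], z[i] = vx, vy, vz   # store once


x1, y1, z1 = [1.0, 2.0], [3.0, 4.0], [0.0, 0.0]
x2, y2, z2 = list(x1), list(y1), list(z1)
run_unfused(x1, y1, z1)
run_fused(x2, y2, z2)
```

Both versions give the same result because every statement is elementwise, so element `i` never depends on another index.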
For now, overloading viennacl::execute(statement, profile) and providing the proper wrapper should be enough: pyviennacl.execute(statement, p.SomeProfile(...))
As for the profile parameters introspection, this is what I used to do (until today). I used to parse the statement entirely and deduce the kind of operation needed (saxpy, dgemm, etc.). Making it implicit did have some drawbacks:
Hi,
I've got a simple Python autotuner working as of now.
In C++, there will be two interface functions:
viennacl::device_specific::execute(profile, std::pair<statement, statement_node> const & statement);
viennacl::device_specific::execute(profile, std::list<std::pair<statement, statement_node>> const & statements);
Where each statement_node is a root node to execute the expression from.
I've decided to wrap that into python:
template.execute(statements);
Therein, I can check whether it's a list or a single tuple. For now, lists cannot be supported, though, since we have no control over the LHS of a statement (we only have x + y, not z = x + y), so we cannot pass something such as:
template.execute([Statement(z=x+y), Statement(x=z+y)]);
Am I correct?
Hi Philippe -- I'm sorry I didn't respond to your previous message yet! First of all, when you write template.execute(...), what is the template object?
Sorry. I should have specified that I have renamed Profile to Template, but perhaps GenerationTemplate would be more appropriate.
Also note that we can do assignment, but because of the semantics of the = operator in Python, you have to write Assign(LHS, RHS). As I'm sure you noticed, when you execute any statement in PyViennaCL, it silently constructs a holder for the result and an assign node to tell ViennaCL to store that result in the holder object.
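A minimal sketch of why an explicit assignment node helps: Python's `=` only rebinds a name and cannot be overloaded, so the destination has to be recorded in an object. The `Assign` class and `execute` function below are hypothetical and do not mirror PyViennaCL's actual signatures. A dictionary stands in for device buffers and lambdas stand in for expression trees. The sketch also shows the single-statement versus list check discussed earlier in the thread.

```python
class Assign:
    """Hypothetical node meaning 'store the value of rhs into lhs'."""

    def __init__(self, lhs, rhs):
        self.lhs = lhs  # name of the destination buffer
        self.rhs = rhs  # callable that computes the value from the environment


def execute(statements, env):
    """Run one Assign, or a list of them in order, against env."""
    if isinstance(statements, Assign):
        statements = [statements]  # normalise the single-statement case
    for statement in statements:
        env[statement.lhs] = statement.rhs(env)
    return env


env = {"x": 1.0, "y": 2.0, "z": 0.0}
execute(
    [
        Assign("z", lambda e: e["x"] + e["y"]),  # z = x + y
        Assign("x", lambda e: e["z"] + e["y"]),  # x = z + y, using the new z
    ],
    env,
)
```

Because each statement names its own destination, a list of statements becomes meaningful: later statements can read what earlier ones wrote.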
Ah, OK.
Ah, indeed, I had not seen Assign :O Perfect then, the question is closed. I think that you can also remove the generator from the "nice-to-have" of your GSoC project. I think that I can handle anything which is generation-related now that I'm more familiar with the pyviennacl codebase. :)
Great!
Hey!
So I've become familiar with PyViennaCL's code. It's very clean, so it was not very difficult. Anyway, here is how the generator works (in my development branch), in C++. Actually, there is much more flexibility if we want to pack multiple operations, but it is not useful as of now to wrap that into Python (since ViennaCL's OpenCL API has not been wrapped yet).
Ideally, I'd wrap device_specific::matrix_product inside a clean python class, and use pyviennacl.Statement:
However, it seems like scheduler::statement is never wrapped (statement_wrapper is wrapped instead), so it won't be easy to do this. I propose the following solution:
Does it sound like a good solution to you? I don't have enough experience with pyviennacl to be entirely sure about the side effects it could have, if any.
PS: This kind of discussion should really take place on a pyviennacl-dev mailing list, I think, but I couldn't find one. Have I missed something? :P