pytorch / xla

Enabling PyTorch on XLA Devices (e.g. Google TPU)
https://pytorch.org/xla

Questions about the return value of lazyTensor pytorch xla subgraph #5569

Open ckfgihub opened 12 months ago

ckfgihub commented 12 months ago

Using LazyTensor, PyTorch/XLA generates an XLA subgraph, and that subgraph adds the live tensors produced in the current training step (subject to certain conditions) as the returns of the subgraph. My questions are:

  1. What does "live tensor" mean here, and what is the design rationale for the values returned by the XLA graph? That is, what can be returned by an XLA graph? By "return" I mean the ROOT node in XLA.
  2. What is the concept of the xla image return here? Is it a return from the XLA device to the host device? Or what does it mean? Below I have pasted the code section where xm.mark_step() triggers compilation and execution during training.

    std::shared_ptr<XLAGraphExecutor::Async>
    XLAGraphExecutor::SyncTensorsGraphInternal(
        std::vector<XLATensorPtr> tensors, absl::Span<const std::string> devices,
        const SyncTensorsConfig& config, bool warm_up_cache_only) {
      tensorflow::profiler::TraceMe activity(
          "SyncTensorsGraphInternal", tensorflow::profiler::TraceMeLevel::kInfo);
      SyncTensorCollection coll = CollectSyncTensors(tensors, config);
      if (coll.indices.empty()) {
        /* Enure previous execution is complete before exiting this
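For reference, here is a minimal sketch of where xm.mark_step() sits in a typical training step (model, loss_fn, optimizer, and loader are illustrative placeholders, not names from this issue):

    import torch_xla.core.xla_model as xm

    device = xm.xla_device()
    model = model.to(device)

    for data, target in loader:
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        loss = loss_fn(model(data), target)  # ops are only recorded as lazy IR here
        loss.backward()
        optimizer.step()
        xm.mark_step()  # cut the graph: compile and run it on the XLA device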

JackCaoG commented 12 months ago
> What does "live tensor" mean here, and what is the design rationale for the values returned by the XLA graph? That is, what can be returned by an XLA graph? By "return" I mean the ROOT node in XLA.

"Live tensor" just means any XLATensor that is still "valid" (alive). For example, if you have

a = torch.tensor(100, device = xla_device)
b = a + 2

both a and b are considered live tensors. For each PyTorch tensor on an XLA device we create an XLATensor C++ object, and GetLiveTensors will return those, I think, possibly through some wrapper.
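To make this concrete, here is a small sketch; _get_xla_tensors_text is an internal torch_xla debug helper, so treat the exact output as illustrative:

    import torch
    import torch_xla
    import torch_xla.core.xla_model as xm

    device = xm.xla_device()
    a = torch.tensor(100, device=device)  # live: still referenced from Python
    b = a + 2                             # live: a pending lazy IR node, not yet computed

    # Dump the pending IR graph for b; a appears as a device-data input.
    print(torch_xla._XLAC._get_xla_tensors_text([b]))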

I don't quite get the second question. The way LTC works is to identify all of the tensors that need a value; for example, in the computation above, a's value is known but b's isn't. If you do xm.mark_step() or just print(b), it will use b as a root node and do a post-order traversal to find the subgraph.
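As a rough illustration of that trigger (the CompileTime/ExecuteTime entries come from torch_xla's debug metrics report; the sketch assumes a single simple run):

    import torch
    import torch_xla.core.xla_model as xm
    import torch_xla.debug.metrics as met

    device = xm.xla_device()
    a = torch.tensor(100, device=device)
    b = a + 2      # nothing has executed yet; b is just a lazy IR node

    print(b)       # forces b to materialize: compiles and runs the subgraph rooted at b
    # xm.mark_step() would do the same for all live tensors at once

    # CompileTime / ExecuteTime entries appear once a graph has actually run.
    print(met.metrics_report())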

> What is the concept of the xla image return here? Is it a return from the XLA device to the host device? Or what does it mean?

what xla image are you referring to here?

ckfgihub commented 12 months ago
> what xla image are you referring to here?

Sorry, that was a typo; I meant to say "XLA computation graph".
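With that reading (question 2 is about the XLA computation graph), one way to see what the graph actually returns is to dump its HLO; _get_xla_tensors_hlo is an internal torch_xla debug helper, so this is a sketch rather than a stable API:

    import torch
    import torch_xla
    import torch_xla.core.xla_model as xm

    device = xm.xla_device()
    a = torch.tensor(100, device=device)
    b = a + 2

    # The ROOT of the entry computation is (roughly) a tuple of the outputs
    # that must live on as device data, i.e. the "returns" the question asks about.
    print(torch_xla._XLAC._get_xla_tensors_hlo([b]))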