Open zhangguanheng66 opened 5 years ago
🚀 Feature
Based on the retro meeting following the PyTorch 1.2.0 release, the team agreed to improve the binary release process across the PyTorch domain libraries. A few general points:
A general guideline for testing quality
Binary release for Windows
Ship nightlies in the future?
CC @soumith @cpuhrsch @ezyang @peterjc123
Nightlies are already being generated for torchvision and torchaudio. For example, see https://anaconda.org/pytorch-nightly/torchvision/files
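(For reference, a quick way to confirm that a nightly build, rather than a stable release, is what actually got installed. This is only a minimal sketch, assuming installation from the pytorch-nightly conda channel linked above, e.g. conda install -c pytorch-nightly torchvision:)

```python
# Minimal sketch: check that a nightly build is installed.
# Assumes the pytorch-nightly channel linked above was used.
import torch
import torchvision

print(torch.__version__)        # nightly versions typically carry a ".devYYYYMMDD" suffix
print(torchvision.__version__)
assert "dev" in torchvision.__version__, "this looks like a stable build, not a nightly"
```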
Can we also have the nightlies for torchtext? :)
Sure, just copy paste the code accordingly ;)
@ezyang Isn't torchtext a Python-only package?
I tried to trigger a nightly job for torchvision on Windows. The conda jobs passed, while the wheel jobs are currently blocked by https://github.com/pytorch/pytorch.github.io/pull/244/files#r316548499. However, the biggest question is which machines we are going to build the binaries on. Are we going to rely on the hosted agents of the online CI or on our own agents? The latter approach is currently used when we build the nightlies for PyTorch.
@ezyang Isn't torchtext a Python-only package?
@peterjc123 torchtext is a Python-only package now. However, there is a PR for a C++ extension (the basic_english_normalize function), and we plan to have a C++ dictionary this half (depending on the results of the sentencepiece binding).
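(For context on why a C++ extension changes the release story: a Python-only package can ship a single universal wheel, while an extension module has to be compiled per platform and Python version. The sketch below shows roughly what such a setup.py could look like using torch.utils.cpp_extension; the module name and source path are made up for illustration and are not torchtext's actual layout:)

```python
# Hypothetical setup.py sketch for adding a C++ extension to a Python-only
# package; the "_torchtext" module name and source path are illustrative only.
from setuptools import setup, find_packages
from torch.utils.cpp_extension import CppExtension, BuildExtension

setup(
    name="torchtext",
    packages=find_packages(),
    ext_modules=[
        CppExtension(
            name="torchtext._torchtext",
            sources=["torchtext/csrc/basic_english_normalize.cpp"],
        )
    ],
    cmdclass={"build_ext": BuildExtension},
)
```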
However, the biggest question is which machines we are going to build the binaries on. Are we going to rely on the hosted agents of the online CI or on our own agents? The latter approach is currently used when we build the nightlies for PyTorch.
Well, on Linux we rely on hosted CircleCI for the binaries, and this is probably going to continue to be the case. I'm not too sure about Windows, though; I think we should do whatever you, @peterjc123, think makes the most sense.
@ezyang Could we run some tests that try building the binaries on WS 2016 and using them on Win7 or WS 2012 R2? If that works, then we can start building Windows containers instead of configuring environments before every build. Also cc @yf225
Yes, that SGTM. Is there something specific you would like me to do to try to make this happen? One thing that seems possible is to resurrect the WS 2016 Windows AMI and try to shift the CI over to it (since we now know that switching to ninja fixes the build failures).
I guess I will need two EC2 machines, one with WS 2008 R2 and one with WS 2016/2019.
Assigning myself for Windows EC2 machines. Do you need GPUs on these too?
@ezyang Yes, I just want to test the CUDA binary compatibility between these OSes.
WS 2008 may not be so easy; I literally cannot get Packer to log into the WS 2008 base image. I'm using the following source_ami_filter:
"source_ami_filter": {
  "filters": {
    "name": "Amazon/Windows_Server-2008-R2_SP3-English-64Bit-Base-*"
  },
  "owners": ["956863127205"],
  "most_recent": true
},
@peterjc123 Are machines booted from the stock images acceptable?
@ezyang Sure.
However, the biggest question is which machines we are going to build the binaries on. Are we going to rely on the hosted agents of the online CI or on our own agents? The latter approach is currently used when we build the nightlies for PyTorch.
Well, on Linux we rely on hosted CircleCI for the binaries, and this is probably going to continue to be the case. I'm not too sure about Windows, though; I think we should do whatever you, @peterjc123, think makes the most sense.
Oh, I just found out that I missed that post. I think it would be better if we had our own nightly build machines, because the parallelism of Azure Pipelines is only 10 concurrent jobs, so building the nightlies would block the CI tests of the main repo. Currently, we are depending on the three build machines provided by Microsoft, which we cannot directly control.
Because the parallelism of Azure Pipelines is only 10 concurrent jobs
Our own nightly build machines are possible. However, we might be able to increase the parallelism of Azure Pipelines. Let me talk to the relevant people.
@peterjc123 has got his Windows machines, unassigning myself.
@ezyang I managed to run some simple smoke tests on WS 2008 R2 using the CUDA binary generated on WS 2016. But when I tried to run test_cuda.py, there were many unspecified launch errors, though the same happens with our current 1.2.0 binaries.
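(A smoke test of roughly the following shape is what would surface such errors; this is only an illustrative sketch, not the actual script or test_cuda.py used above. In PyTorch, an "unspecified launch failure" surfaces as a RuntimeError once the device is synchronized or results are copied back:)

```python
# Illustrative CUDA smoke test (not the actual test suite); a kernel launch
# that fails on an unsupported driver/OS combination raises a RuntimeError
# when the work is synchronized.
import torch

assert torch.cuda.is_available()
print(torch.version.cuda, torch.cuda.get_device_name(0))

x = torch.randn(1024, 1024, device="cuda")
y = x @ x.t()
torch.cuda.synchronize()  # force any deferred launch errors to surface
print(y.sum().item())
```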
Hmm... I wonder if a build from source on WS 2008 would be OK. But that doesn't sound promising :(
@ezyang No need to test this; it's probably related to the TDR settings. Also, this OS is too old: it is not even listed among the supported OSes in the CUDA 10 documentation.
@ezyang Is it possible to get a Win7 AMI on EC2?
https://www.quora.com/Can-we-launch-a-Windows-7-instance-on-AWS-If-so-what%E2%80%99s-the-whole-process seems to imply it's not possible. You'll probably have to VirtualBox it or something :/
@ezyang Okay, I just tested the binary on WS 2012 R2, and it seems to work fine; the CUDA tests passed. ~It makes me wonder what the difference is between CUDA_win10_setup.exe and CUDA_win_setup.exe. Someone mentioned that it's something related to WDDM, but if that's true, how can we use these libraries on a different OS?~
Nice! So that means we can use CircleCI for binary builds? (Also, did you see their message that they now have beta GPU support on Windows?)
@ezyang Yes, we could use WS 2019 for building the binaries. BTW, could you please apply https://github.com/pytorch/audio/pull/219 to pytorch/vision so that I can learn how to use CircleCI on Windows quickly?
cc @ezyang