-
Thanks for open-sourcing the code! I have a question - your paper seems to revolve around mono-architectural weight initialization. What if I want to use a very large pretrained ViT to initialize a mu…
-
I trained a qkeras model with the kernel and bias quantizers of every QDense layer set to `quantized_bits(8,0)`. After training, I print out the weights and biases of the QDense layers.
I expect them to h…
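A minimal sketch of the setup described, assuming the standard QKeras API (layer names and shapes below are illustrative, not from the original report). One detail worth noting: `get_weights()` returns the float "latent" weights the optimizer updates, while the quantizer is applied on the forward pass; `qkeras.utils.model_save_quantized_weights` is the usual way to export the quantized values.

```python
import numpy as np
from tensorflow import keras
from qkeras import QDense, quantized_bits

# Illustrative model: the QDense layer quantizes kernel and bias
# with quantized_bits(8, 0), as in the question.
model = keras.Sequential([
    QDense(
        16,
        input_shape=(8,),
        kernel_quantizer=quantized_bits(8, 0),
        bias_quantizer=quantized_bits(8, 0),
        name="qdense_1",
    ),
])

# These are the float latent weights, not the quantized values used in the
# forward pass, so they will generally not lie on the 8-bit grid.
kernel, bias = model.get_layer("qdense_1").get_weights()
print(np.unique(kernel)[:10])
```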
-
Hey
I was wondering if you could shed some light on why you added learnable weights and biases to the sync and identity losses? To me it seemed like you were possibly trying to scale and shift but …
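For concreteness, a sketch of the scale-and-shift reading of the question (an interpretation, not the paper's confirmed design):

```python
import torch
import torch.nn as nn

# Hypothetical wrapper: applies a learnable scale and shift to a scalar loss term.
class AffineLossWrapper(nn.Module):
    def __init__(self):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(()))  # learnable scale
        self.bias = nn.Parameter(torch.zeros(()))   # learnable shift

    def forward(self, loss_value: torch.Tensor) -> torch.Tensor:
        return self.weight * loss_value + self.bias

sync_scale = AffineLossWrapper()
identity_scale = AffineLossWrapper()
total_loss = sync_scale(torch.tensor(0.7)) + identity_scale(torch.tensor(0.3))
```

Note that, as written, the shift contributes no gradient to the model parameters, and an unconstrained scale could simply be driven toward zero, which may be part of what the question is getting at.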
-
parser.py line 39:
`nn.weights = np.array(weights)`
gives an error because `weights` is not a homogeneous array:
`ValueError: setting an array element with a sequence. The requested array has a…`
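A minimal reproduction and one possible workaround (variable names below are illustrative; recent NumPy versions refuse to build a regular ndarray from ragged input unless `dtype=object` is given explicitly):

```python
import numpy as np

# Per-layer weight matrices usually have different shapes, so together they
# form a "ragged" sequence that cannot be packed into one homogeneous array.
weights = [np.zeros((3, 4)), np.zeros((4, 2))]

# np.array(weights)  # ValueError: setting an array element with a sequence.

# Possible workaround: make the raggedness explicit with an object array
# (or simply keep the plain Python list of arrays).
nn_weights = np.array(weights, dtype=object)
```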
-
Following [how we do it in LoRA recipes](https://github.com/pytorch/torchtune/blob/afd23fd0b2f9051958affae20890396e2594756f/recipes/lora_finetune_distributed.py#L475), we should add the ability to use…
-
Hi, this work is very interesting. However, I have some questions about the initialization of the weights and biases in the SGP module. Due to my limited coding experience, I cannot understand why the weig…
-
See this [Neurostars post](https://neurostars.org/t/smoothing-surface-data/20159) and #2747 for a related issue.
The idea would be to have a smoothing function for surface data similar to `nilearn.…
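As a sketch of what such a function might look like, here is one common approach (iterative neighbor averaging over the mesh graph); this illustrates the technique, not a proposed nilearn API:

```python
import numpy as np
from scipy import sparse

def smooth_surface(data, faces, n_iter=10):
    """Smooth per-vertex values by iteratively averaging over mesh neighbors."""
    n_vertices = data.shape[0]
    # Collect the three edges of every triangle, then symmetrize.
    edges = np.vstack([faces[:, [0, 1]], faces[:, [1, 2]], faces[:, [2, 0]]])
    adj = sparse.coo_matrix(
        (np.ones(len(edges)), (edges[:, 0], edges[:, 1])),
        shape=(n_vertices, n_vertices),
    )
    adj = ((adj + adj.T) > 0).astype(float)
    # Include each vertex itself, then row-normalize into an averaging operator.
    adj = adj + sparse.eye(n_vertices, format="csr")
    averaging = sparse.diags(1.0 / np.asarray(adj.sum(axis=1)).ravel()) @ adj
    smoothed = np.asarray(data, dtype=float)
    for _ in range(n_iter):
        smoothed = averaging @ smoothed
    return smoothed
```

Repeated averaging approximates Gaussian smoothing on the surface; more principled variants use the mesh geometry itself (e.g. heat-kernel smoothing).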
-
qwen2-vl has always been memory-hungry (compared to the other vision models), and even with unsloth it still OOMs, whereas the largest llama3.2 11b works fine.
I'm using a dataset that has high resolution…
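One common mitigation, assuming the Hugging Face Qwen2-VL processor is in use (not necessarily the fix for this report): bound the pixel budget so high-resolution images are downscaled before being turned into vision tokens.

```python
from transformers import AutoProcessor

# Fewer pixels -> fewer vision tokens -> less activation memory.
# The exact bounds below are illustrative and worth tuning per dataset.
min_pixels = 256 * 28 * 28
max_pixels = 1024 * 28 * 28
processor = AutoProcessor.from_pretrained(
    "Qwen/Qwen2-VL-7B-Instruct",
    min_pixels=min_pixels,
    max_pixels=max_pixels,
)
```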
-
**FutureWarning**: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data …
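The warning disappears once `weights_only` is passed explicitly; a short sketch (the checkpoint path is a placeholder):

```python
import torch

# For checkpoints that contain only tensors / state dicts, opt in to the
# safer unpickling behavior that will become the default:
state_dict = torch.load("checkpoint.pt", weights_only=True)

# If the checkpoint stores arbitrary Python objects from a source you trust,
# silence the warning by opting out explicitly instead:
# obj = torch.load("checkpoint.pt", weights_only=False)
```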