AvivNavon / deep-align

Official implementation of Equivariant Deep Weight Space Alignment [ICML 2024]
MIT License

Data split fails #2

Closed: yonatansverdlov closed this issue 3 weeks ago

yonatansverdlov commented 1 month ago

Hi, I have two issues.

First, I ran the following command to create the splits:

python experiments/utils/data/generate_splits.py --data-root datasets/mnist_classifiers --save-path datasets/splits.json

and got the following error:

    raise ValueError(
ValueError: With n_samples=0, test_size=0.25 and train_size=None, the resulting train set will be empty. Adjust any of the aforementioned parameters.

Second, could you add the networks for all the other datasets, such as CIFAR10 and LST?

Thanks!

AvivNavon commented 1 month ago

Hi, could you please make sure that the all_files list (here: https://github.com/AvivNavon/deep-align/blob/main/experiments/utils/data/generate_splits.py#L16) is not empty?
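A minimal sketch of the check suggested above, assuming the script gathers checkpoints with a recursive glob for `*.pth` under `--data-root`. The helper name `collect_checkpoints` is hypothetical and is not part of the repository; it only illustrates failing early with a clearer message than the downstream `ValueError`.

```python
from __future__ import annotations

from pathlib import Path


def collect_checkpoints(data_root: str) -> list[str]:
    """Hypothetical helper: list every *.pth file under data_root, failing early if none exist."""
    root = Path(data_root)
    if not root.is_dir():
        # Most likely a wrong or misspelled folder name.
        raise FileNotFoundError(f"--data-root is not an existing directory: {root}")
    # Recursive glob, sorted so the result is reproducible across runs.
    all_files = sorted(p.as_posix() for p in root.glob("**/*.pth"))
    if not all_files:
        raise FileNotFoundError(f"No .pth files found under {root}")
    return all_files
```

With a guard like this, an empty or missing folder is reported before any split is attempted.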

yonatansverdlov commented 1 month ago

It's empty but I followed the instructions.

AvivNavon commented 1 month ago

Could you please provide the structure of the datasets/mnist_classifiers folder?

yonatansverdlov commented 1 month ago

It contains around 10K model files ending in .pth. The all_files variable is an empty list.

AvivNavon commented 1 month ago

But what is the structure of the datasets/mnist_classifiers folder? Are there other folders inside, or just the *.pth files (e.g., datasets/mnist_classifiers/model_xx.pth)?

yonatansverdlov commented 1 month ago

Like this, with many files: datasets/mnist_classifiers/model_xx.pth

AvivNavon commented 1 month ago

Are you sure you are passing the correct path as --data-root? Please try this:

from pathlib import Path
data_root = "datasets/mnist_models"
data_root = Path(data_root)
all_files = [p.as_posix() for p in data_root.glob("**/*.pth")]
all_files[:10]

The output should look like this:

['datasets/mnist_models/model_899.pth', 'datasets/mnist_models/model_3082.pth', 'datasets/mnist_models/model_641.pth', 'datasets/mnist_models/model_4935.pth', 'datasets/mnist_models/model_1695.pth', 'datasets/mnist_models/model_7582.pth', 'datasets/mnist_models/model_6844.pth', 'datasets/mnist_models/model_8869.pth', 'datasets/mnist_models/model_5395.pth', 'datasets/mnist_models/model_127.pth']

yonatansverdlov commented 1 month ago

Thanks, I'll check it out. I ran python experiments/utils/data/generate_splits.py --data-root datasets/mnist_classifiers --save-path datasets/splits.json unchanged. Should I have changed anything in the command?

AvivNavon commented 1 month ago

Try providing the full (absolute) path to datasets/mnist_classifiers rather than a relative one.

yonatansverdlov commented 1 month ago

['datasets/mnist_models/model_3302.pth', 'datasets/mnist_models/model_2930.pth', 'datasets/mnist_models/model_2542.pth', 'datasets/mnist_models/model_1457.pth', 'datasets/mnist_models/model_1825.pth', 'datasets/mnist_models/model_4309.pth', 'datasets/mnist_models/model_5549.pth', 'datasets/mnist_models/model_9289.pth', 'datasets/mnist_models/model_2123.pth', 'datasets/mnist_models/model_2684.pth']

The full path yields the same.

AvivNavon commented 1 month ago

Try running the generate_splits.py command with the full path (and maybe provide the test/val sizes).

yonatansverdlov commented 1 month ago

I tried, but it still fails.

yonatansverdlov commented 1 month ago

I ran exactly what you asked and got all_files = []. However, as I showed before, the paths do exist.

AvivNavon commented 1 month ago

Could you share the exact command you are using and the full traceback? Also, could you please try to debug why the file structure does not match data_root.glob("**/*.pth")?
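One way to approach the debugging requested above is a small diagnostic that reports what the glob can actually see. This is only a sketch: `describe_data_root` is a hypothetical helper, not something the repository provides, and the path passed at the bottom is the one from the command in this thread.

```python
from collections import Counter
from pathlib import Path


def describe_data_root(data_root: str) -> dict:
    """Hypothetical diagnostic: summarize what a recursive *.pth glob finds under data_root."""
    root = Path(data_root)
    report = {
        "resolved": root.resolve().as_posix(),  # absolute path actually being searched
        "exists": root.is_dir(),
        "pth_per_dir": {},  # number of .pth files found in each directory
        "siblings": [],     # neighbouring folders, filled in when data_root is missing
    }
    if root.is_dir():
        counts = Counter(p.parent.as_posix() for p in root.glob("**/*.pth"))
        report["pth_per_dir"] = dict(counts)
    elif root.parent.is_dir():
        # The requested folder is missing: list its neighbours to spot a naming mismatch.
        report["siblings"] = sorted(p.name for p in root.parent.iterdir() if p.is_dir())
    return report


print(describe_data_root("datasets/mnist_classifiers"))
```

If `exists` is false and `siblings` lists a similarly named folder, the unzipped archive probably uses a different directory name than the one passed as `--data-root`.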

yonatansverdlov commented 1 month ago

Regarding the commands:

mkdir datasets
wget "https://www.dropbox.com/s/sv85hrjswaspok4/mnist_classifiers.zip"
unzip -q mnist_classifiers.zip -d datasets
python experiments/utils/data/generate_splits.py --data-root datasets/mnist_classifiers --save-path datasets/splits.json

AvivNavon commented 1 month ago

I think I see the problem: the subfolder is called mnist_models, not mnist_classifiers.

yonatansverdlov commented 1 month ago

OK, so what is the fix?

AvivNavon commented 1 month ago

python experiments/utils/data/generate_splits.py --data-root datasets/mnist_models --save-path datasets/splits.json

AvivNavon commented 1 month ago

Also, I suggest providing exact sizes for the test/val splits using --test-size and --val-size.
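To illustrate what fixing the split sizes explicitly means, here is a standard-library sketch of a deterministic train/val/test split. It is not the repository's implementation (the error message earlier in this thread appears to come from scikit-learn's `train_test_split`). The helper `split_files`, the seed, and the use of absolute counts are all assumptions made for illustration; the sizes shown are not the values used in the paper, and the real `--test-size` and `--val-size` flags may expect a different format.

```python
import random


def split_files(files, val_size, test_size, seed=42):
    """Hypothetical helper: shuffle deterministically, then cut into train/val/test by count."""
    if not files:
        raise ValueError("No files to split; check --data-root.")
    if val_size < 0 or test_size < 0 or val_size + test_size >= len(files):
        raise ValueError("val_size + test_size must leave at least one training file.")
    shuffled = sorted(files)               # fixed starting order, independent of glob order
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility
    test = shuffled[:test_size]
    val = shuffled[test_size:test_size + val_size]
    train = shuffled[test_size + val_size:]
    return {"train": train, "val": val, "test": test}


# Illustrative file names only; 100 placeholder checkpoints.
files = [f"datasets/mnist_models/model_{i}.pth" for i in range(100)]
splits = split_files(files, val_size=10, test_size=10)
print({name: len(paths) for name, paths in splits.items()})
# → {'train': 80, 'val': 10, 'test': 10}
```

Passing the sizes explicitly makes the split independent of the script's defaults, and an empty file list is rejected with a message that points at the likely cause.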

yonatansverdlov commented 1 month ago

That worked. But which val/test split sizes did you use for the paper (in generate.py)?

AvivNavon commented 1 month ago

I believe we provide the full experimental details in the Appendix of the paper

yonatansverdlov commented 1 month ago

And what about the other datasets?

AvivNavon commented 1 month ago

We will make an effort to release the other datasets and the supporting code in the future.

AvivNavon commented 3 weeks ago

We've released the code for the CNN experiments.

yonatansverdlov commented 3 weeks ago

Awesome, thanks for the update! Does it also include the CIFAR experiments?

AvivNavon commented 3 weeks ago

Yes.