FMInference / DejaVu

Predictor without activation function? #1

Closed: YixinSong-e closed this issue 1 year ago

YixinSong-e commented 1 year ago

I saw in the code:

query_layer = torch.nn.Sequential(
    torch.nn.Linear(CONFIG[args.model]['d'], args.D, bias=None),
    torch.nn.Linear(args.D, CONFIG[args.model]['d']*4, bias=None),
)

It seems that the predictor is a small MLP without any activation function. Does this mean that the layers are linearly separable?
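For context on the question, two bias-free linear layers stacked with no nonlinearity between them compute the same function as one linear map whose rank is at most the inner width. The sketch below checks this numerically in plain Python. The toy sizes `d` and `D` and the helpers `matmul` and `make_weight` are hypothetical names chosen for illustration, not taken from the DejaVu code; it only illustrates the algebra and says nothing about how well such a predictor works in practice.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def make_weight(rows, cols, rng):
    """Random weight matrix (hypothetical stand-in for a trained Linear weight)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d, D = 6, 2  # toy sizes (assumption); the real values come from CONFIG and args.D

W1 = make_weight(D, d, rng)      # first Linear: d -> D, no bias
W2 = make_weight(4 * d, D, rng)  # second Linear: D -> 4*d, no bias
x = [[rng.uniform(-1.0, 1.0)] for _ in range(d)]  # one input as a d x 1 column

# Applying the two layers in sequence ...
two_step = matmul(W2, matmul(W1, x))

# ... matches applying the single merged matrix W = W2 @ W1 (shape 4d x d).
W = matmul(W2, W1)
one_step = matmul(W, x)

max_diff = max(abs(a[0] - b[0]) for a, b in zip(two_step, one_step))

# The factored form stores fewer weights than a full 4d x d matrix when D is small.
params_factored = D * d + 4 * d * D
params_full = 4 * d * d
print(max_diff < 1e-9, params_factored, params_full)
```

Because `W` is a product through a width-`D` bottleneck, its rank cannot exceed `D`, so the stacked form acts as a low-rank factorization of a single linear layer rather than as a nonlinear MLP.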

czq693497091 commented 5 months ago

Hi, it seems that you have successfully run this project. I am now trying to run DejaVu/Decentralized_FM_alpha/run_infer_opt_175b_collect_sp_data.sh, but the file mlp_sp_x_16.mmap is missing. How can I download the mmap files? Thanks!

YixinSong-e commented 5 months ago

Actually, I rewrote the logic in collect_sp_data. There is another implementation you can refer to: https://github.com/Raincleared-Song/DejaVu_predictor

czq693497091 commented 5 months ago

Thanks! It really helps!