Owen-Oertell / rlcm


prompt image alignment experiment LLaVA-server #1

Open jeeyung opened 4 months ago

jeeyung commented 4 months ago

Hello!

Could you please elaborate on how you set up LLaVA-server? I am struggling to get the server working.

Owen-Oertell commented 4 months ago

Yeah, LLaVA-server is a bit tricky to get working. You need to use this repository from Kevin Black. Is there something specific that you are unsure about?

The main thing is just to install the package from its pyproject.toml. I also recommend doing this in a separate conda environment, since I believe it requires a different version of transformers than RLCM does.
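To confirm that the two environments really hold different versions, a small check like the one below can be run once in each. This is only a sketch: the helper name `installed_version` is hypothetical, and checking `torch` alongside `transformers` is an assumption, since the thread only mentions transformers.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Package names are illustrative; adjust to whatever each environment needs.
    for name in ("transformers", "torch"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```

Running it inside the LLaVA-server environment and again inside the RLCM environment should show whether the transformers versions actually differ.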

Oh, I see that you hit an OOM error. I think it's worth trying the 7B model and seeing if you can get that to work. You could also quantize the model more aggressively. The GPUs that I used had 48 GB of memory (NVIDIA A6000s).
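As a rough guide to why model size and quantization matter for OOM, the back-of-the-envelope calculation below estimates the memory taken by the weights alone. It is a sketch under stated assumptions: the parameter counts are nominal (7e9 and, as an assumed larger variant, 13e9), the function name `weight_memory_gib` is hypothetical, and real usage is higher because activations, the KV cache, and the vision encoder also need memory.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(num_params: float, precision: str) -> float:
    """Estimate memory for model weights only, in GiB."""
    return num_params * BYTES_PER_PARAM[precision] / 1024**3


if __name__ == "__main__":
    # Nominal parameter counts; these are assumptions, not measured values.
    for label, num_params in (("7B", 7e9), ("13B", 13e9)):
        for precision in BYTES_PER_PARAM:
            gib = weight_memory_gib(num_params, precision)
            print(f"{label} @ {precision}: ~{gib:.1f} GiB for weights")
```

Under these assumptions, the 7B weights in fp16 come to about 13 GiB, so a smaller model or lower precision leaves considerably more headroom on a single GPU.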

jeeyung commented 4 months ago

Hi, thank you for the quick response.

Actually, I encountered many issues, including OOM, an index-out-of-range error for the embedding matrix, a missing config for LLaVA, etc. I got it running with several tricks from here and there, but it often returns a BERT score of 0.

As one example, didn't you run into the issue below related to llavaconfig?

image