I'm still not 100% sure whether to call it llava.cpp or to pick another name that signals future support for other multimodal generation models -- maybe multimodal.cpp or lmm.cpp (large multimodal model). Open to suggestions, but let's go with llava.cpp as a code name for now.
Update CMakeLists.txt with a flag CLIP_STANDALONE to toggle standalone mode. When ON, build against the ggml submodule; when OFF, build with the ggml.h and ggml.c files included directly in llama.cpp.
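A minimal sketch of what that toggle could look like in clip.cpp's CMakeLists.txt (target names and the LLAMA_CPP_DIR variable are illustrative assumptions, not the actual build setup):

```cmake
# Sketch only: hypothetical CLIP_STANDALONE toggle; paths/targets are assumptions.
option(CLIP_STANDALONE "Build clip.cpp against its own ggml submodule" ON)

if (CLIP_STANDALONE)
    # Standalone mode: use the ggml submodule checked out in this repo.
    add_subdirectory(ggml)
    target_link_libraries(clip PRIVATE ggml)
else()
    # Embedded mode: compile the ggml sources shipped inside llama.cpp directly.
    target_sources(clip PRIVATE ${LLAMA_CPP_DIR}/ggml.c)
    target_include_directories(clip PRIVATE ${LLAMA_CPP_DIR})
endif()
```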
Implement a function to get hidden states from a given layer index, to be used in llava.cpp.
Create a separate repo for llava.cpp. It should add both clip.cpp and llama.cpp as submodules and build with CLIP_STANDALONE=OFF so that clip builds against the ggml sources included in llama.cpp.
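The llava.cpp superproject's top-level CMakeLists.txt could then look roughly like this (a sketch under assumed submodule paths and target names; the repo URLs are deliberately elided):

```cmake
# Hypothetical top-level CMakeLists.txt for the llava.cpp repo.
# Assumes clip.cpp and llama.cpp are checked out as submodules, e.g.:
#   git submodule add https://github.com/.../clip.cpp
#   git submodule add https://github.com/.../llama.cpp
cmake_minimum_required(VERSION 3.12)
project(llava_cpp C CXX)

# Build clip against the ggml sources that ship inside llama.cpp.
set(CLIP_STANDALONE OFF CACHE BOOL "" FORCE)

add_subdirectory(llama.cpp)
add_subdirectory(clip.cpp)

add_executable(llava llava.cpp)
target_link_libraries(llava PRIVATE clip llama)
```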