KoboldAI / KoboldAI-Client

For GGUF support, see KoboldCPP: https://github.com/LostRuins/koboldcpp
https://koboldai.com
GNU Affero General Public License v3.0

NSFW models no longer showing up on Colab #330

Open Tynach opened 1 year ago

Tynach commented 1 year ago

I'm not sure if this is technically a bug, a wish list item, or just a question, but a few hours ago a couple of commits were made that are labeled as merely cleaning up the model lists, yet actually remove all NSFW models from the colab files, and all mention of NSFW models having ever been there.

The commits in question are 148f900 and c11a269.

I would have thought that if Google requested NSFW models be removed, that would have simply been included in the commit message as an explanation. But the commit message is blank, and the title makes it seem like it was just a cleanup (which, to be fair, there were some models that were now defunct, like the 'Lit V2' model).

I tried looking under both 'Issues' and 'Discussions', but didn't find anything about it, so I'm just kinda confused. Why was this change made?

henk717 commented 1 year ago

I had to rush this change last night after the community spotted this new policy, specifically the last point.

https://policies.google.com/terms/generative-ai/use-policy

Google has been cracking down on UIs lately and even banned some models like Pygmalion, so to protect the notebook I removed all references, and did so with an obscure message to make sure I didn't introduce an NSFW reference in the commit history of the notebooks.

The notebook does allow manually typing a model name in the model field, but doing so is at your own risk and responsibility. Google has banned accounts for specific models in the past.

Tynach commented 1 year ago

@henk717 thanks for the response! However, I think you might have missed something about that link. It's explicitly talking about this service:

https://blog.google/products/search/generative-ai-search/

I don't think it applies to Colab (though I could be wrong). I do remember when Pygmalion was banned from Colab, but I'd been led to believe (by people in the Pygmalion subreddit) that it was because many people were abusing Colab by using a combination of proxies and additional accounts just to keep running Pygmalion, so they could have a companion to chat with all day.

While I'm not really sure of the validity of that claim, I do know that on the Horde, the most popular model by far is Pygmalion. There's really no comparison: it usually has a minimum of 5 instances at any given time, I've seen as many as 15 going all at once, and it's still slow because of how many people are using it. I guess people really like their AI chatbots.


Anyway, I'm not sure how to make that last thing you said work. What would I type in the model name field? A model name wouldn't tell it where to download the model from, so I'd have to provide that somehow. I could edit the Python code to include it, but that seems like the 'wrong' way to do it.

Or am I missing something obvious?

Tynach commented 1 year ago

@henk717, I just found this link:

https://policies.google.com/terms/generative-ai

I simply took your link and removed the /use-policy part at the end. The very first thing it says under 'Use Restrictions' is, "You may not use the Services to develop machine learning models or related technology." I'm pretty sure that's the entire point of using Colab, so I really don't think that page, nor the page you linked to, apply to Colab at all.

henk717 commented 1 year ago

It's very possible and you are probably right, but it will not make me change my mind on this. I am aware that the policy probably does not apply to Colab at this moment, but I made the change ahead of the internal memo in case Colab follows suit.

When they check the notebook on its use and they don't see NSFW models, the risk of both us and the models getting banned is much lower. People wanting to use these models know which models they want to use, and I have not added any blocks. So if you do still want to use them you can; you just need to know the model's full name on Huggingface and use that as the input, if you want to take this responsibility yourself.

Derpford commented 1 year ago

Conveniently, there are links to each model's huggingface page in the readme, and those pages have a button to put the full model name into your clipboard. It's maybe a couple extra clicks.

I wish we didn't have to deal with this; AI-generated stories are arguably the least harmful form that porn can take. Unfortunately, until running GPT at home stops being a thing you need high-end hardware for, and starts being a thing mid-to-low-end consumer hardware can do, we just have to deal with the possibility that Google will throw a fit...

Tynach commented 1 year ago

@henk717 That's entirely fair, and after looking through more links I think I found I was partly wrong anyway. The same day you made the change, the ToS for paid Colab subscriptions changed, and that change was mostly to include the 'Generative AI' terms as something that applies to Colab as well... If you pay a subscription fee.

They probably don't want their higher-tier resources being used to generate NSFW content when those resources could instead be allocated to more important/higher priority things. They might also not want their own high-tier resources being used to create or develop competing platforms, but they're okay with lower-tiered resources being used for that purpose.

Source: The Colab FAQ mentions:

Additional restrictions exist for paid users here.

That link currently redirects to this link, which it says was updated yesterday on the 10th. On that page it says (emphasis mine):

To use the paid Colab Services (the “Colab Paid Services”), you must accept (1) the Google Terms of Service, and (2) these Colab Paid Services Additional Terms of Service (the “Colab Paid Services Additional Terms” or “Contract”), and (3) Google's Generative AI Terms of Services.

That page also has a link to a list of previous versions of that page, and on said previous versions the part in bold is not present.

All that said, it would not surprise me if they decided to have these changes also affect non-paid users in the future. Heck, it might be that they just haven't added the necessary pages to make that policy change apparent yet, and that's already the case.

Either way, I did manage to get a model from HuggingFace working by name, so I have absolutely no more complaints about any of this. I'd say this issue can probably be closed now.

henk717 commented 1 year ago

I'll leave it open for exposure, so people know why the change was made and what the alternative options are. I also want to point out for @Derpford that local is getting better and better: if you download https://koboldai.org/cpp you have a CPU-optimized version of Kobold which works on GGML models (which you can also search for on https://huggingface.co) with pretty good speeds for a CPU.

Derpford commented 1 year ago

@henk717 I think there's a way to pin an issue to the top of the issue page? That might help.

Will have to experiment with the CPU-optimized version of KoboldAI. I'm running a low-end first-gen Ryzen, so not top-of-the-line equipment, but not terrible either. If I can get decent perf out of it, I may just stop using Colab altogether.

Derpford commented 1 year ago

Update: Turns out 32 GB is not enough to run both an Erebus GGML adaptation and Windows. This is mildly frustrating.

henk717 commented 1 year ago

It should fit, since Erebus 20B is in the 16 GB - 20 GB territory. Just make sure you don't have too much open, and don't use the fp16 version of the model.

Derpford commented 1 year ago

I was using the q5_1 file; the problem is that, near as I can tell, Windows and Discord eat 8 GB by themselves, and having a single Edge tab open to use the interface eats another 4 GB. That leaves exactly 20 GB for everything else... which is just barely not enough to store both the model and the prompt. So now everything's waiting on swap, which I really shouldn't rely on because it's probably gonna murder my SSD...

henk717 commented 1 year ago

On my system Windows uses 4 GB idle, and that's already on the high side, since my VMs only use 2.5 GB idle. So if Windows by itself is using 8 GB, that would be very suspicious; none of the hundreds of machines I maintain for my day job do that.

Koboldcpp by default won't touch your swap; it will just stream missing parts from disk, so it's reads only, no writes. It's almost certainly other memory-hungry background processes you have running that are getting in the way.

Derpford commented 1 year ago

It looks like swap usage to me; memory usage and disk usage are both capped out. I might set up my Linux partition again and see if I can get it running more efficiently there.

henk717 commented 1 year ago

The program is more intelligent than that. As long as --nommap is not used, it will map the file on the drive to a memory region instead of loading it all into memory the traditional way. So when you run out of RAM it will discard those parts of the model and stream them from disk, leaving your pagefile alone. That does have the side effect that other software won't shrink its own memory usage, because we are such a friendly neighbor. I can load 65B models and my pagefile will not increase in size with this method.

There are two things you can try. The first is --mlock or --nommap, which will change that behavior so that Koboldcpp becomes more aggressive and claims the memory for itself in full. This will force other software to give way and shrink, at the expense that it can now start hitting your swap file if you don't manage it.

The other way is using a third-party utility such as https://www.majorgeeks.com/files/details/memory_cleaner_danskee.html to aggressively clear the memory used by your other software prior to launching Koboldcpp, so you have more memory for it.
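The read-only memory-mapping behavior described above can be illustrated with a tiny, self-contained sketch (Python here purely for illustration; koboldcpp itself does this in C/C++ via the OS's mmap facilities, and the file name below is a stand-in, not a real model):

```python
import mmap
import os
import tempfile

# Stand-in for a multi-gigabyte GGML model file; the paging principle
# is the same regardless of size.
path = os.path.join(tempfile.mkdtemp(), "model.bin")
with open(path, "wb") as f:
    f.write(bytes(4096))

# Map the file read-only. The OS pages data in on demand, and under
# memory pressure it can simply discard clean (unmodified) pages and
# re-read them from disk later -- no writes to the pagefile/swap occur.
with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
    first_byte = mm[0]  # touching a byte faults its page into RAM
print(first_byte)
```

Because the mapping is backed by the model file itself rather than by anonymous memory, evicted pages cost a disk read to bring back, but never a swap write.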

Derpford commented 1 year ago

So, the weird thing is that the model file is on my D drive, but the disk usage is all on C. I have my D drive mounted to a folder under C, but when testing with another program, something loaded from the D drive shows disk usage on D. I'm going to go open an issue in koboldcpp's repo.

[EDIT] Apparently I'm a couple versions behind, but I don't feel like closing everything again to test the new version. Will get back to it in a few days.