head-iie-vnr opened this issue 2 days ago
Google Colab enforces quota limits to ensure fair usage and resource availability for all users. These limits vary, especially for free users, and may change over time. The key quota categories are:
Runtime Duration:
Daily Usage Limits:
Hardware Specifications:
Disk Space:
Google offers paid plans called Colab Pro and Colab Pro+ that provide additional resources and higher limits:
Colab Pro:
Colab Pro+:
It's important to be aware that these limits are subject to change, and it's a good idea to check the Google Colab FAQ or the Google Colab Pro page for the latest information.
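Because the actual allocation varies from session to session, it can help to inspect the runtime directly rather than rely on published numbers. A minimal sketch using only the standard library (works in Colab or any local Python environment):

```python
import os
import shutil

# Check disk space and CPU count for the current runtime.
total, used, free = shutil.disk_usage("/")
print(f"Disk: {free / 1e9:.1f} GB free of {total / 1e9:.1f} GB total")
print(f"CPU cores: {os.cpu_count()}")
```

In a Colab notebook with a GPU runtime selected, `!nvidia-smi` additionally shows the GPU model and memory.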
Working with Large Language Models (LLMs) in Python involves using various libraries that facilitate model implementation, data handling, and evaluation. Here are some of the most popular libraries:
Hugging Face Transformers usage:
from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer
# Example: Text generation
generator = pipeline('text-generation', model='gpt2')
result = generator("Once upon a time", max_length=50)
print(result)
TensorFlow usage:
import tensorflow as tf
from transformers import TFAutoModel, AutoTokenizer
# Example: Loading a pre-trained model
model = TFAutoModel.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
PyTorch usage:
import torch
from transformers import AutoModel, AutoTokenizer
# Example: Loading a pre-trained model
model = AutoModel.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
Hugging Face Datasets usage:
from datasets import load_dataset
# Example: Loading a dataset
dataset = load_dataset('imdb')
print(dataset['train'][0])
NumPy usage:
import numpy as np
# Example: Creating an array
array = np.array([1, 2, 3, 4, 5])
print(array)
Pandas usage:
import pandas as pd
# Example: Creating a DataFrame
data = {'name': ['Alice', 'Bob'], 'age': [25, 30]}
df = pd.DataFrame(data)
print(df)
Matplotlib and Seaborn usage:
import matplotlib.pyplot as plt
import seaborn as sns
# Example: Plotting data
sns.set(style="whitegrid")
tips = sns.load_dataset("tips")
sns.boxplot(x="day", y="total_bill", data=tips)
plt.show()
scikit-learn usage:
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
import numpy as np
# Example: Training a classifier on a tiny toy dataset with binary labels
X, y = np.arange(10).reshape((5, 2)), [0, 1, 0, 1, 0]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = LogisticRegression().fit(X_train, y_train)
NLTK and spaCy usage:
import nltk
import spacy
# NLTK example: Tokenizing text
nltk.download('punkt')
nltk.download('punkt_tab')  # also required by newer NLTK releases
from nltk.tokenize import word_tokenize
tokens = word_tokenize("Hello, how are you?")
print(tokens)
# spaCy example: Named Entity Recognition
# (run "python -m spacy download en_core_web_sm" once to fetch the model)
nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is looking at buying U.K. startup for $1 billion")
for ent in doc.ents:
    print(ent.text, ent.label_)
These libraries provide a comprehensive toolkit for working with large language models and performing a wide range of machine learning and data processing tasks.
https://colab.research.google.com/drive/1wrdxxv1aczuFdjRpHREKxo8dx1LPfM3a#scrollTo=OoGw2U0696Ip
!pip install transformers torch
# Import necessary libraries
from transformers import pipeline
# Initialize the sentiment analysis pipeline with a specific model
senti_analysis = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")
# Define the list of texts
texts = [
"I do not like to eat cake, but I like the smell",
"I like cake smell, but I do not like to eat it",
"I like the cake",
"I do not like the smell"
]
# Perform sentiment analysis on the texts
results = senti_analysis(texts)
# Print the results
for text, result in zip(texts, results):
    print(f"Text: {text}\nSentiment: {result['label']}, Score: {result['score']:.4f}\n")
By specifying both the task and the model, you ensure that the pipeline is configured correctly to provide reliable sentiment analysis results for your input texts.
The pipeline function initializes the pipeline for the "sentiment-analysis" task. The argument model="distilbert-base-uncased-finetuned-sst-2-english" selects a specific model hosted on Hugging Face's Model Hub: a fine-tuned version of DistilBERT, which is a smaller and faster variant of BERT (Bidirectional Encoder Representations from Transformers). The specified task and model work together to analyze the sentiment of input texts: the pipeline tokenizes each input and passes it to the distilbert-base-uncased-finetuned-sst-2-english model, which is optimized for understanding and classifying sentiment in English text.
Output generated:
Text: I do not like to eat cake, but I like the smell Sentiment: POSITIVE, Score: 0.9992
Text: I like cake smell, but I do not like to eat it Sentiment: POSITIVE, Score: 0.5582
Text: I like the cake Sentiment: POSITIVE, Score: 0.9997
Text: I do not like the smell Sentiment: NEGATIVE, Score: 0.9969
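The labels and scores above come from taking a softmax over the model's two output logits and reporting the larger probability (for this model, index 0 maps to NEGATIVE and index 1 to POSITIVE). A minimal sketch of that final classification step, with made-up logits standing in for the model's real output:

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

labels = ["NEGATIVE", "POSITIVE"]
logits = [-3.1, 3.4]  # hypothetical values; real logits come from the model head
probs = softmax(logits)
best = max(range(len(probs)), key=lambda i: probs[i])
print(f"Sentiment: {labels[best]}, Score: {probs[best]:.4f}")
```

This is why each result carries a score near 1.0 when the model is confident and closer to 0.5 when it is not, as in the second example text above.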
https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english
The above page shows more details about the model, such as its model card, usage examples, and download statistics.
https://huggingface.co/models?pipeline_tag=text-generation&sort=trending
On the Hugging Face Model Hub, the sentiment-analysis task falls under the task type 'Text Classification'.
Google Colab (short for "Colaboratory") is a free cloud-based platform provided by Google that allows users to write and execute Python code in a Jupyter notebook environment. It is particularly popular among data scientists, machine learning practitioners, and educators for several reasons:
Cloud-Based: Since it is hosted on the cloud, you do not need to install any software on your local machine. You can access your notebooks from any device with an internet connection.
Free Access to GPUs and TPUs: Google Colab offers free access to powerful hardware accelerators like GPUs (Graphics Processing Units) and TPUs (Tensor Processing Units), which significantly speed up the execution of complex computations, especially those involved in deep learning and large-scale data processing.
Jupyter Notebook Interface: The interface is similar to Jupyter Notebooks, which means you can write and run Python code in cells, visualize data with plots, and integrate Markdown for documentation.
Integration with Google Drive: You can easily save and manage your notebooks in your Google Drive, enabling seamless collaboration and sharing.
Pre-Installed Libraries: Google Colab comes pre-installed with many popular Python libraries for data science and machine learning, such as TensorFlow, Keras, PyTorch, Pandas, NumPy, and many more.
Collaboration: Multiple users can work on the same notebook simultaneously, making it an excellent tool for collaborative projects and teaching.
Easy to Share: You can share your notebooks with others via a simple link, and they can view or even edit the notebook depending on the permissions you set.
Google Colab is widely used for tasks such as prototyping machine learning models, conducting exploratory data analysis, and teaching programming and data science concepts.