llSourcell / Make_Money_with_Tensorflow_2.0

This is the code for "Make Money with Tensorflow 2.0" by Siraj Raval

ValueError: Found array with 0 sample(s) (shape=(0, 1)) while a minimum of 1 is required by MinMaxScaler. #7

Open JackRossProjects opened 5 years ago

JackRossProjects commented 5 years ago

Trying to train the scaler with training data and smooth data:

```python
smoothing_window_size = 2500
for di in range(0, 10000, smoothing_window_size):
    scaler.fit(train_data[di:di+smoothing_window_size,:])
    train_data[di:di+smoothing_window_size,:] = scaler.transform(train_data[di:di+smoothing_window_size,:])
```

I get this error:

```
ValueError                                Traceback (most recent call last)
in
      2 smoothing_window_size = 2500
      3 for di in range(0,10000,smoothing_window_size):
----> 4 scaler.fit(train_data[di:di+smoothing_window_size,:])
      5 train_data[di:di+smoothing_window_size,:] = scaler.transform(train_data[di:di+smoothing_window_size,:])

~\Anaconda3\lib\site-packages\sklearn\preprocessing\data.py in fit(self, X, y)
    306         # Reset internal state before fitting
    307         self._reset()
--> 308         return self.partial_fit(X, y)
    309
    310     def partial_fit(self, X, y=None):

~\Anaconda3\lib\site-packages\sklearn\preprocessing\data.py in partial_fit(self, X, y)
    332
    333         X = check_array(X, copy=self.copy, warn_on_dtype=True,
--> 334                         estimator=self, dtype=FLOAT_DTYPES)
    335
    336         data_min = np.min(X, axis=0)

~\Anaconda3\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    460                              " minimum of %d is required%s."
    461                              % (n_samples, shape_repr, ensure_min_samples,
--> 462                                 context))
    463
    464     if ensure_min_features > 0 and array.ndim == 2:

ValueError: Found array with 0 sample(s) (shape=(0, 1)) while a minimum of 1 is required by MinMaxScaler.
```

Apologies if I'm missing something glaringly obvious, but I'm at a loss.
makamkkumar commented 5 years ago

I'm facing the same issue. Have you found a workaround, please?

gigerbytes commented 5 years ago

Make sure you download all the data from 1962 from the Yahoo page (set the filter to Max & click Apply), otherwise the smoothing window runs past the end of the data.

The default download is only 20 records.
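To make the point above concrete: with only 20 records, every slice starting at index 20 or beyond is empty (shape `(0, 1)`), and fitting `MinMaxScaler` on an empty array raises exactly this `ValueError`. A minimal sketch, using a made-up 20-row array as a stand-in for the truncated download, with a guard against empty windows:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Stand-in for the truncated Yahoo download: only 20 rows instead of thousands
train_data = np.linspace(10.0, 20.0, 20).reshape(-1, 1)

scaler = MinMaxScaler()
smoothing_window_size = 2500
for di in range(0, 10000, smoothing_window_size):
    window = train_data[di:di + smoothing_window_size, :]
    if window.shape[0] == 0:
        # Slicing past the end of the array yields an empty array;
        # calling scaler.fit() on it raises the ValueError from the traceback.
        break
    scaler.fit(window)
    train_data[di:di + smoothing_window_size, :] = scaler.transform(window)

print(train_data.min(), train_data.max())  # data now scaled into [0, 1]
```

Downloading the full history (so `train_data` has at least 10000 rows) makes every window non-empty and the original loop works unchanged; the guard just makes the failure mode explicit.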

karan842 commented 3 years ago

I can't understand this error and I also want a solution. Someone please explain!

nmshafie1993 commented 2 years ago

I am getting the same error, any solution?

izzetcankurt commented 2 years ago

Check this out : https://stackoverflow.com/questions/53421626/valueerror-found-array-with-0-sample-s-shape-0-1-while-a-minimum-of-1-is

BogereMark879 commented 1 year ago

I want to build a Flask API that connects to a Flutter mobile application. Below is the code of the Flask API:

```python
import pickle
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from flask import Flask, request, jsonify
from flask_cors import CORS
import pandas as pd
import numpy as np
import nltk
import re

app = Flask(__name__)
CORS(app)

# Load the model
model = pickle.load(open('similarity1.pkl', 'rb'))

# Load the attractions and preferences data
attractions = pd.read_csv(r"C:\Users\Bogere\OneDrive\Desktop\Tourism\tourism_attractions.csv")
preferences = pd.read_csv(r"C:\Users\Bogere\OneDrive\Desktop\Tourism\userA_preferences.csv")
attractions = attractions[['item_id', 'name', 'experience_tags']]

# Normalize the documents
nltk.download('stopwords')
nltk.download('punkt')  # required by nltk.word_tokenize
stop_words = nltk.corpus.stopwords.words('english')

def normalize_document(document):
    # note: re.sub's fourth positional argument is count, so flags= must be named
    document = re.sub(r'[^a-zA-Z0-9\s]', '', document, flags=re.I | re.A)
    document = document.lower()
    document = document.strip()
    tokens = nltk.word_tokenize(document)
    filtered_tokens = [token for token in tokens if token not in stop_words]
    document = ' '.join(filtered_tokens)
    return document

norm_corpus_attractions = attractions['experience_tags'].apply(normalize_document)
norm_corpus_preferences = preferences['preferences'].apply(normalize_document)

# Compute the cosine similarity scores
tfidf_vectorizer = TfidfVectorizer(ngram_range=(1, 2), min_df=1)
tfidf_matrix_attractions = tfidf_vectorizer.fit_transform(norm_corpus_attractions)
tfidf_matrix_preferences = tfidf_vectorizer.transform(norm_corpus_preferences)
cosine_similarity_scores = cosine_similarity(tfidf_matrix_attractions, tfidf_matrix_preferences)
df_cosine_similarity = pd.DataFrame(cosine_similarity_scores)
df_cosine_similarity.index = df_cosine_similarity.index + 1
df_cosine_similarity.index.name = 'item_id'
df_cosine_similarity = df_cosine_similarity.rename(columns={0: 'similarity_score'})

@app.route('/', methods=['GET'])
def index():
    if request.method == 'GET':
        # Get the selected checkboxes from the user input
        selected_preferences = request.args.getlist('title')

        # Normalize the selected preferences
        norm_selected_preferences = [normalize_document(pref) for pref in selected_preferences]

        # Compute the cosine similarity scores
        tfidf_matrix_selected_preferences = tfidf_vectorizer.transform(norm_selected_preferences)
        cosine_similarity_selected = cosine_similarity(tfidf_matrix_attractions, tfidf_matrix_selected_preferences)
        df_cosine_similarity_selected = pd.DataFrame(cosine_similarity_selected)
        df_cosine_similarity_selected.index = df_cosine_similarity_selected.index + 1
        df_cosine_similarity_selected.index.name = 'item_id'
        df_cosine_similarity_selected = df_cosine_similarity_selected.rename(columns={0: 'similarity_score'})

        # Merge the attractions data with the similarity scores
        attractions_with_similarity_scores = pd.merge(attractions, df_cosine_similarity_selected, on='item_id')

        # Sort the recommendations by similarity score in descending order
        recommendations = attractions_with_similarity_scores.sort_values(by='similarity_score', ascending=False)

        # Select the top N recommendations
        N = 5
        top_recommendations = recommendations['name'].tolist()[:N]

        # Return the recommendations as a JSON object
        return jsonify({'recommendations': top_recommendations})

if __name__ == '__main__':
    app.run(debug=True)
```

Unfortunately, the link served by the Flask app gives me the error below. How can I solve it? Someone help me, please.

ValueError: Found array with 0 sample(s) (shape=(0, 47)) while a minimum of 1 is required by TfidfTransformer.
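The `(0, 47)` shape suggests `tfidf_vectorizer.transform` is being called with zero documents, which happens when `request.args.getlist('title')` returns an empty list (e.g. the URL is opened with no `title` query parameters). A minimal sketch of the failure and a guard, using a hypothetical two-document corpus in place of the CSV data:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical corpus standing in for attractions['experience_tags']
tfidf_vectorizer = TfidfVectorizer(ngram_range=(1, 2), min_df=1)
tfidf_vectorizer.fit(["hiking beach wildlife", "museum culture food market"])

def transform_preferences(selected_preferences):
    """Mirror of the route body: bail out instead of crashing on empty input."""
    if not selected_preferences:
        # request.args.getlist('title') was empty; transform([]) would raise
        # "Found array with 0 sample(s) ... required by TfidfTransformer."
        return None
    return tfidf_vectorizer.transform(selected_preferences)

assert transform_preferences([]) is None
matrix = transform_preferences(["beach wildlife"])
print(matrix.shape[0])  # one query document
```

In the route itself, returning `jsonify({'recommendations': []})` (or an HTTP 400) when the list is empty avoids the crash; the fix is to check the list before calling `transform`, not to change the vectorizer.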