rasbt / python-machine-learning-book

The "Python Machine Learning (1st edition)" book code repository and info resource
MIT License
12.24k stars, 4.4k forks

operands could not be broadcast together with shapes (5,5) (10,5) #8

Closed zhangzfmail closed 8 years ago

zhangzfmail commented 8 years ago

I am trying to execute MLPGradientCheck.fit from Chapter 12, but I got an error. The code is copied from the Chapter 12 source code. Could you please tell me where I am wrong and how to fix it?

(screenshot of the error traceback)

rasbt commented 8 years ago

Sorry to hear that you are having problems with that! I just reran the code from the IPython notebook (https://github.com/rasbt/python-machine-learning-book/tree/master/code/ch12) and it worked fine for me.

nn_check = MLPGradientCheck(n_output=10, 
                            n_features=X_train.shape[1], 
                            n_hidden=10, 
                            l2=0.0, 
                            l1=0.0, 
                            epochs=10, 
                            eta=0.001,
                            alpha=0.0,
                            decrease_const=0.0,
                            minibatches=1, 
                            shuffle=False,
                            random_state=1)

Then

nn_check.fit(X_train[:5], y_train[:5], print_progress=False)

which should yield something like:

Ok: 2.55068505986e-10
Ok: 2.93547837023e-10
Ok: 2.37449571314e-10
Ok: 3.08194323691e-10
Ok: 3.38249440642e-10
Ok: 3.57890221135e-10
Ok: 2.19231256383e-10
Ok: 2.36583740198e-10
Ok: 3.43584860701e-10
Ok: 2.13345208113e-10
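(For context, each "Ok" line prints the relative error between the numerically approximated gradients and the gradients computed via backpropagation; values on the order of 1e-10 mean the two agree very closely. A minimal sketch of that kind of metric, not the exact code from the class:)

import numpy as np

def relative_error(grad_num, grad_ana):
    # Small values (e.g., ~1e-10 as in the output above) indicate that the
    # analytical gradients closely match the numerical approximation.
    return (np.linalg.norm(grad_num - grad_ana) /
            (np.linalg.norm(grad_num) + np.linalg.norm(grad_ana)))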

When I inspected the input samples and labels, they looked like this:

(screenshot of the input samples and labels)

Can you maybe double-check that the X and y arrays have the same dimensions? Maybe something went wrong during parsing MNIST.
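For example, with the MNIST data from the chapter I would expect something along these lines (just an illustrative check; 784 feature columns for the 28x28 images):

>>> X_train[:5].shape
(5, 784)
>>> y_train[:5].shape
(5,)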

rasbt commented 8 years ago

In #9, you mentioned that a3 is practically always a "nan" matrix. Hm, I think the easiest thing to check is the input arrays X and y. Can you please print the X_train and y_train you use to fit the model? I think that will give us some useful clues! Also, have you tried to run the IPython notebook (ch12) from this GitHub repo? I am curious if it's maybe an OS-related issue or if there may be a typo in the script. Let me know how it goes! :)

zhangzfmail commented 8 years ago

Hi, thank you for the quick response. I just checked X_train and y_train, and they seemingly look fine.

(screenshot of X_train and y_train)

rasbt commented 8 years ago

Hm, I am thinking that it could be a floating-point precision problem. It would be nice if you could run the IPython notebook from this repository. It works fine for me, and it worked for the other readers as well. So, if the IPython notebook doesn't run properly on your system, we could narrow the problem down further and look at floating-point precision and related issues.

zhangzfmail commented 8 years ago

I ran the IPython notebook and the result is the same.

(screenshot of the same error)

rasbt commented 8 years ago

Sorry, I have never seen this problem before. I will run it on my other Linux machine tomorrow to see if I get the same problem. If you execute

>>> import numpy as np
>>> np.finfo(np.float).precision

does it return 15 or a value higher than that?
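(That value is roughly the number of significant decimal digits of the default float type; 15 corresponds to 64-bit floats, which is what we want here. For comparison:)

>>> np.finfo(np.float64).precision
15
>>> np.finfo(np.float32).precision
6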

zhangzfmail commented 8 years ago

Thank you in advance for your help. I executed the code and it returns 15.

(screenshot of the output)

rasbt commented 8 years ago

Oh, that's very useful info! I just noticed that you are running the code on Python 2.7 instead of Python 3. I guess that's what causes the problem (I will check it out on Python 2.7 later and edit the code to make it compatible).

zhangzfmail commented 8 years ago

Thank you

rasbt commented 8 years ago

Hm, it worked fine for me when I ran it via Python 2.7.9 and it worked fine on my linux machine (CentOS) as well. Sorry that I don't have a good solution for you at hand ...

a) I am curious if this is Python 2.7 related; does it work for you if you run the notebook via Python 3.5?

b) If you are using Python 2.7, maybe an additional

from __future__ import division

could help in your case (however, as mentioned before, it worked fine for me on Py27 without it; see the small division example after this list)

c) can you please try to run

nn_check.fit(X_train[:5].astype(float), y_train[:5], print_progress=False)?

d) Btw., did the rest of the code that came before gradient checking run okay on your machine?
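Regarding b), here is the difference that import makes; under Python 2, the / operator truncates integer division, which can silently zero out scaled quantities (illustrative snippet):

>>> 1 / 2        # Python 2 without the import
0
>>> from __future__ import division
>>> 1 / 2        # now matches Python 3
0.5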

zhangzfmail commented 8 years ago

Thank you for the information. I tried the code you provided, but the problem is the same. By the way, the code before gradient checking seemingly works fine.

(screenshot of the same error)

rasbt commented 8 years ago

Hm, I am really curious to find out what causes this on your machine. Would you mind attaching the .py script here so that I can run it on my different machines?

In addition, it would be helpful to know which Python version and package versions you are using so that I can recreate your environment. E.g., by executing

$ python -V
Python 3.5.1 :: Continuum Analytics, Inc.
$ python -c 'import numpy; print(numpy.__version__)'
1.11.0
$ python -c 'import scipy; print(scipy.__version__)'
0.17.0
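Or, all in one go (with Python 3), for example:

$ python -c 'import numpy, scipy; print(numpy.__version__, scipy.__version__)'
1.11.0 0.17.0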

zhangzfmail commented 8 years ago

Please remove the extension ".txt" before using them.

By the way, the information you need is as follows:

(screenshot of the version information)

MLPGradientCheck.py.txt mynewpic.py.txt

rasbt commented 8 years ago

Thanks, I found the problem! When I installed NumPy 1.8.2 and SciPy 0.13.3, I got the same problem as you did. However, when I upgraded the packages to NumPy 1.9.3 and SciPy 0.14.0, it works just fine.

Although I listed the minimum version requirements on page 15 (end of Chapter 1), I'd even recommend installing the latest versions, e.g., NumPy 1.11.0 and SciPy 0.17.0.
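For example, via pip (or conda, depending on your setup):

$ pip install --upgrade numpy scipy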

(screenshot of the successful run after upgrading)

zhangzfmail commented 8 years ago

Thank you for the important information

rasbt commented 8 years ago

You are welcome. Sorry, but the package updates seem to be the only solution to this problem. But aside from this chapter's code, I would recommend updating anyway, since many bugs and problems have been fixed in the latest NumPy and SciPy releases -- I'd always try to stay up to date with these packages.

atanumandal0491 commented 5 years ago

Is this problem related to this?

python recognize.py --file p364_001.wav
Traceback (most recent call last):
  File "recognize.py", line 53, in <module>
    mfcc = np.transpose(np.expand_dims(librosa.feature.mfcc(wav, 16000), axis=0), [0, 2, 1])
  File "/usr/local/lib/python2.7/dist-packages/librosa/feature/spectral.py", line 1279, in mfcc
    S = power_to_db(melspectrogram(y=y, sr=sr, **kwargs))
  File "/usr/local/lib/python2.7/dist-packages/librosa/feature/spectral.py", line 1371, in melspectrogram
    mel_basis = filters.mel(sr, n_fft, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/librosa/filters.py", line 238, in mel
    lower = -ramps[i] / fdiff[i]
ValueError: operands could not be broadcast together with shapes (1,1025) (0,)

YashBangera7 commented 5 years ago

Is this problem related to this? (quoting the same traceback as above)

Did you get a fix for this? If yes, please help me; I need some help urgently.

atanumandal0491 commented 5 years ago

What were you trying to do?

rasbt commented 5 years ago

I just noticed in the error output that you are using Python 2.7; it could also be related to that, because I am not sure whether recent versions of NumPy and scikit-learn still support Python 2.7 properly.

atanumandal0491 commented 5 years ago

I was working on audio-to-text when I ran into this issue. Later I found it was due to a version mismatch between the required packages; I suggest you work with the latest packages. The error came from the division where the denominator array was empty (length 0).
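For illustration, this is the kind of shape mismatch that produces that message; just a minimal sketch, not the actual librosa internals:

>>> import numpy as np
>>> ramps = np.ones((1, 1025))
>>> fdiff = np.array([])   # empty denominator array
>>> -ramps / fdiff
Traceback (most recent call last):
  ...
ValueError: operands could not be broadcast together with shapes (1,1025) (0,)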

mady143 commented 4 years ago

@rasbt @zhangzfmail @atanumandal0491 @YashBangera7 @SumitBando, hi, I am getting an error like ValueError: operands could not be broadcast together with shapes (2336,122) (121,). I am using Python 3 and scikit-learn==0.22.1. Could anyone help resolve this issue?

Thanks and Regards, Manikantha Sekhar.

rasbt commented 4 years ago

Hi there,

could you share in which chapter this is happening? Is it an error in Ch12 similar to this original issue?

mady143 commented 4 years ago

@rasbt ,

(screenshot 1: the error)

from this code below:

from flask import Flask, flash, render_template, session, request, redirect, url_for

import PyPDF2

import docx2txt

import pandas as pd

from IPython.display import Markdown, display, clear_output

import _pickle as cPickle

from pathlib import Path

import gensim
from gensim.test.utils import datapath, get_tmpfile
from gensim.models import KeyedVectors
from gensim.scripts.glove2word2vec import glove2word2vec

import random

import os

app = Flask(__name__)

app.secret_key = "super secret key"

@app.route('/')
def home():
    return render_template('index.html')

@app.route('/dict_output', methods=['POST'])
def render():
    if request.method == "POST":
        file = request.form['file']

    file_name,file_ext = os.path.splitext(file)

    if file_ext == '.pdf':
        data = ""
        pdfFileObj = open(file, 'rb')
        pdfReader = PyPDF2.PdfFileReader(pdfFileObj)
        for page in range(pdfReader.numPages):
            # print("page:",page)
            pageObj = pdfReader.getPage(page)
            extracted_data = pageObj.extractText()
            data += extracted_data

    else:
        textfile = open(file,'r')
        data = textfile.read()

final = {}
def dumpPickle(fileName, content):

    pickleFile = open(fileName, 'wb')
    cPickle.dump(content, pickleFile, -1)
    pickleFile.close()

def loadPickle(fileName):

    file = open(fileName, 'rb')

    content = cPickle.load(file)
    print("content:",content)
    file.close()

    return content

def pickleExists(fileName):
    file = Path(fileName)

    if file.is_file():
        return True

    return False

import spacy
from spacy import displacy
nlp = spacy.load('en_core_web_sm')

#Extract answers and the sentence they are in
def extractAnswers(qas, doc):
    # print("qas:",qas)
    answers = []

    senStart = 0
    senId = 0

    for sentence in doc.sents:
        senLen = len(sentence.text)

        for answer in qas:
            answerStart = answer['answers'][0]['answer_start']

            if (answerStart >= senStart and answerStart < (senStart + senLen)):
                answers.append({'sentenceId': senId, 'text': answer['answers'][0]['text']})

        senStart += senLen
        senId += 1

    return answers

#TODO - Clean answers from stopwords?
def tokenIsAnswer(token, sentenceId, answers):
    for i in range(len(answers)):
        if (answers[i]['sentenceId'] == sentenceId):
            if (answers[i]['text'] == token):
                return True
    return False

#Save named entities start points

def getNEStartIndexs(doc):
    neStarts = {}
    for ne in doc.ents:
        neStarts[ne.start] = ne

    return neStarts 

def getSentenceStartIndexes(doc):
    senStarts = []

    for sentence in doc.sents:
        senStarts.append(sentence[0].i)

    return senStarts

def getSentenceForWordPosition(wordPos, senStarts):
    for i in range(1, len(senStarts)):
        if (wordPos < senStarts[i]):
            return i - 1

def addWordsForParagrapgh(newWords, text):
    doc = nlp(text)

    neStarts = getNEStartIndexs(doc)
    senStarts = getSentenceStartIndexes(doc)

    #index of word in spacy doc text
    i = 0

    while (i < len(doc)):
        #If the token is a start of a Named Entity, add it and push to index to end of the NE
        if (i in neStarts):
            word = neStarts[i]
            #add word
            currentSentence = getSentenceForWordPosition(word.start, senStarts)
            wordLen = word.end - word.start
            shape = ''
            for wordIndex in range(word.start, word.end):
                shape += (' ' + doc[wordIndex].shape_)

            newWords.append([word.text,
                            0,
                            0,
                            currentSentence,
                            wordLen,
                            word.label_,
                            None,
                            None,
                            None,
                            shape])
            i = neStarts[i].end - 1
        #If not a NE, add the word if it's not a stopword or a non-alpha (not regular letters)
        else:
            if (doc[i].is_stop == False and doc[i].is_alpha == True):
                word = doc[i]

                currentSentence = getSentenceForWordPosition(i, senStarts)
                wordLen = 1

                newWords.append([word.text,
                                0,
                                0,
                                currentSentence,
                                wordLen,
                                None,
                                word.pos_,
                                word.tag_,
                                word.dep_,
                                word.shape_])
        i += 1

def oneHotEncodeColumns(df):
    columnsToEncode = ['NER', 'POS', "TAG", 'DEP']

    for column in columnsToEncode:
        one_hot = pd.get_dummies(df[column])
        one_hot = one_hot.add_prefix(column + '_')

        df = df.drop(column, axis = 1)
        df = df.join(one_hot)

    return df
def generateDf(text):
    words = []
    addWordsForParagrapgh(words, text)

    wordColums = ['text', 'titleId', 'paragrapghId', 'sentenceId','wordCount', 'NER', 'POS', 'TAG', 'DEP','shape']
    df = pd.DataFrame(words, columns=wordColums)

    return df
def prepareDf(df):

    #One-hot encoding
    wordsDf = oneHotEncodeColumns(df)

    #Drop unused columns
    columnsToDrop = ['text', 'titleId', 'paragrapghId', 'sentenceId', 'shape']
    wordsDf = wordsDf.drop(columnsToDrop, axis = 1)

    #Add missing colums 
    predictorColumns = ['wordCount','NER_CARDINAL','NER_DATE','NER_EVENT','NER_FAC','NER_GPE','NER_LANGUAGE','NER_LAW','NER_LOC','NER_MONEY','NER_NORP','NER_ORDINAL','NER_ORG','NER_PERCENT','NER_PERSON','NER_PRODUCT','NER_QUANTITY','NER_TIME','NER_WORK_OF_ART','POS_ADJ','POS_ADP','POS_ADV','POS_CCONJ','POS_DET','POS_INTJ','POS_NOUN','POS_NUM','POS_PART','POS_PRON','POS_PROPN','POS_PUNCT','POS_SYM','POS_VERB','POS_X','TAG_''','TAG_-LRB-','TAG_.','TAG_ADD','TAG_AFX','TAG_CC','TAG_CD','TAG_DT','TAG_EX','TAG_FW','TAG_IN','TAG_JJ','TAG_JJR','TAG_JJS','TAG_LS','TAG_MD','TAG_NFP','TAG_NN','TAG_NNP','TAG_NNPS','TAG_NNS','TAG_PDT','TAG_POS','TAG_PRP','TAG_PRP$','TAG_RB','TAG_RBR','TAG_RBS','TAG_RP','TAG_SYM','TAG_TO','TAG_UH','TAG_VB','TAG_VBD','TAG_VBG','TAG_VBN','TAG_VBP','TAG_VBZ','TAG_WDT','TAG_WP','TAG_WRB','TAG_XX','DEP_ROOT','DEP_acl','DEP_acomp','DEP_advcl','DEP_advmod','DEP_agent','DEP_amod','DEP_appos','DEP_attr','DEP_aux','DEP_auxpass','DEP_case','DEP_cc','DEP_ccomp','DEP_compound','DEP_conj','DEP_csubj','DEP_csubjpass','DEP_dative','DEP_dep','DEP_det','DEP_dobj','DEP_expl','DEP_intj','DEP_mark','DEP_meta','DEP_neg','DEP_nmod','DEP_npadvmod','DEP_nsubj','DEP_nsubjpass','DEP_nummod','DEP_oprd','DEP_parataxis','DEP_pcomp','DEP_pobj','DEP_poss','DEP_preconj','DEP_predet','DEP_prep','DEP_prt','DEP_punct','DEP_quantmod','DEP_relcl','DEP_xcomp']

    for feature in predictorColumns:
        if feature not in wordsDf.columns:
            wordsDf[feature] = 0

    return wordsDf
def predictWords(wordsDf, df):

    predictorPickleName = 'data/pickles/nb-predictor.pkl'
    predictor = loadPickle(predictorPickleName)
    print("predictor:",predictor)
    y_pred = predictor.predict_proba(wordsDf)
    print("y_pred:",y_pred)
    labeledAnswers = []
    for i in range(len(y_pred)):
        labeledAnswers.append({'word': df.iloc[i]['text'], 'prob': y_pred[i][0]})

    return labeledAnswers
def blankAnswer(firstTokenIndex, lastTokenIndex, sentStart, sentEnd, doc):
    leftPartStart = doc[sentStart].idx
    leftPartEnd = doc[firstTokenIndex].idx
    rightPartStart = doc[lastTokenIndex].idx + len(doc[lastTokenIndex])
    rightPartEnd = doc[sentEnd - 1].idx + len(doc[sentEnd - 1])

    question = doc.text[leftPartStart:leftPartEnd] + '_____' + doc.text[rightPartStart:rightPartEnd]

    return question
def addQuestions(answers, text):
    doc = nlp(text)
    currAnswerIndex = 0
    qaPair = []

    #Check whether each token is the next answer
    for sent in doc.sents:
        for token in sent:

            #If all the answers have been found, stop looking
            if currAnswerIndex >= len(answers):
                break

            #In the case where the answer consists of more than one token, check the following tokens as well.
            answerDoc = nlp(answers[currAnswerIndex]['word'])
            answerIsFound = True

            for j in range(len(answerDoc)):
                if token.i + j >= len(doc) or doc[token.i + j].text != answerDoc[j].text:
                    answerIsFound = False

            #If the current token is corresponding with the answer, add it 
            if answerIsFound:
                question = blankAnswer(token.i, token.i + len(answerDoc) - 1, sent.start, sent.end, doc)

                qaPair.append({'question' : question, 'answer': answers[currAnswerIndex]['word'], 'prob': answers[currAnswerIndex]['prob']})

                currAnswerIndex += 1

    return qaPair
def sortAnswers(qaPairs):
    orderedQaPairs = sorted(qaPairs, key=lambda qaPair: qaPair['prob'])

    return orderedQaPairs

glove_file = 'data/embeddings/glove.6B.300d.txt'
tmp_file = 'data/embeddings/word2vec-glove.6B.300d.txt'

glove2word2vec(glove_file, tmp_file)
model = KeyedVectors.load_word2vec_format(tmp_file)

def generate_distractors(answer,count):
    # print("Answer:",answer)

    # count = 3
    # print("Count:",count)
    answer = str.lower(answer)

    ##Extracting closest words for the answer. 
    try:
        closestWords = model.most_similar(positive=[answer], topn=count)
        # print("ClosestWords:",closestWords)
    except:
        #In case the word is not in the vocabulary, or other problem not loading embeddings
        return []

    #Return count many distractors
    distractors = list(map(lambda x: x[0], closestWords))[0:count]
    # print("distractors:",distractors)
    return distractors
def addDistractors(qaPairs, count):

    for qaPair in qaPairs:
        distractors = generate_distractors(qaPair['answer'], count)

        qaPair['distractors'] = distractors

    return qaPairs
def generateQuestions(text, count):
    # print("text:",text)
    # Extract words 
    df = generateDf(text)
    # print("DF:",df)
    wordsDf = prepareDf(df)
    # print("wordsdf:",wordsDf)
    # print("DF:",df)
    # Predict 
    labeledAnswers = predictWords(wordsDf, df)

    # Transform questions
    qaPairs = addQuestions(labeledAnswers, text)

    # Pick the best questions
    orderedQaPairs = sortAnswers(qaPairs)

    # Generate distractors
    questions = addDistractors(orderedQaPairs[:count], 3)
    # print("QQQQQQQQQ:",questions)

    for i in range(count):
        dic1 = {}
        dic2 = {}

        questions[i]['distractors'].append(questions[i]['answer'])

        options = questions[i]['distractors']
        random.shuffle(options)

        dic1['A--Question'] = questions[i]['question']
        dic1['B--Answer'] = questions[i]['answer']
        # dic2[i] = dic1
        list1 = []
        for distractor in options:
            list1.append(distractor)
        # print("LIST1:",list1)
        dic1['C--Options'] = list1
        final[i] = dic1
        # print("final:",final)

    return final
generateQuestions(data,5)

return render_template("output.html",final_data = final)

if __name__ == "__main__":
    app.run(debug=True)

  1. While trying to upload a .txt file, I get the output shown in the second screenshot.

  2. While trying to upload a .pdf file, I get the output shown in the third screenshot.

I am a little confused about what is causing this issue. Could you help me? If you need more information, I will provide it.

Thanks & Regards, Manikantha Sekhar.

rasbt commented 4 years ago

Hi there,

it doesn't look like any code from the book?