jasonrig / address-net

A package to structure Australian addresses
MIT License
195 stars 86 forks

Model input and output shape #10

Closed CharlesAverill closed 4 years ago

CharlesAverill commented 4 years ago

I'm trying to re-implement this in Keras. What's the output shape for this model? Does it output the indices of the text spans that fall in each category, or something completely different?

jasonrig commented 4 years ago

Hi @CharlesAverill, sorry for the delay in answering this question. The output shape of the model is [number of characters] x [number of classes]. So each character is assigned something like "street number", or "suburb", etc., and characters of the same class are grouped when generating the final output. By the way, if you make your keras implementation public, let me know; I'd love to take a look at your approach.
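To make the per-character scheme concrete, here is a small illustrative sketch of that "predict a class per character, then group runs of the same class" idea. The class names and probability values below are made up for illustration; they are not address-net's actual label set or real model output:

```python
import numpy as np

# Illustrative class set (not address-net's real labels)
classes = ["street_number", "street_name", "street_type"]
chars = list("5 Main St")

# Pretend model output: one softmax row per character, shape (9, 3)
probs = np.array([
    [0.90, 0.05, 0.05],  # '5'
    [0.10, 0.80, 0.10],  # ' '
    [0.10, 0.80, 0.10],  # 'M'
    [0.10, 0.80, 0.10],  # 'a'
    [0.10, 0.80, 0.10],  # 'i'
    [0.10, 0.80, 0.10],  # 'n'
    [0.05, 0.05, 0.90],  # ' '
    [0.05, 0.05, 0.90],  # 'S'
    [0.05, 0.05, 0.90],  # 't'
])

labels = probs.argmax(axis=-1)  # one class index per character

# Group characters of the same class into fields
fields = {}
for ch, lab in zip(chars, labels):
    fields.setdefault(classes[lab], []).append(ch)
result = {k: "".join(v).strip() for k, v in fields.items()}
print(result)  # {'street_number': '5', 'street_name': 'Main', 'street_type': 'St'}
```

The final output is then one string per class rather than anything span-index based.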

CharlesAverill commented 4 years ago

@jasonrig

Very basic Keras implementation here: https://github.com/CharlesAverill/Address-Atomization

My public code is usually much cleaner but this went straight to a client. It seems to work pretty well with generated data. If it generates enough interest I'll clean it up and package it for pip use. I really just did the Keras implementation because I find it much easier to understand than anything in TF.

narasimhankrishna commented 4 years ago

@CharlesAverill I ran your code with a simple address set of my own (the address labels are different from those in the code). It produces a .h5 file and saves it to disk. Now how do I load and use this model to predict an address string? Can you please help with an example? It would be great. Thanks.

CharlesAverill commented 4 years ago

@narasimhankrishna I think I still have a loading script (but no promises lol), I’ll commit it tomorrow morning CST

CharlesAverill commented 4 years ago

@narasimhankrishna So it looks like I actually don't have my loading script anymore. But I believe it mainly consisted of these lines from the training script:

    import random
    import numpy as np

    # Pick a random test example (len(X_test) - 1 keeps the index in range,
    # since randint's upper bound is inclusive)
    i = random.randint(0, len(X_test) - 1)
    p = model.predict(np.array([X_test[i]]))  # shape: (1, seq_len, num_tags)
    p = np.argmax(p, axis=-1)                 # class index per position
    print("{:15}: {}".format("Word", "Pred"))
    for w, pred in zip(X_test[i], p[0]):
        print("{:15}: {}".format(words[w], tags[pred]))

Obviously you'll need to load the .h5, then format the address you'd like to atomize as a numpy array in the same format as your training data. You'll also need the words list, which I think should contain any words you expect to see while using the model. Not 100% sure because I haven't seen this code in months. Sorry about that. Good luck!
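For what it's worth, a rough end-to-end sketch of that save / load / predict cycle might look like the following. The tiny untrained model, the vocabulary, and the tag names here are all placeholder assumptions standing in for whatever your training script produced; swap in your own `.h5` file and preprocessing:

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.layers import Embedding, Dense, TimeDistributed

# Placeholder vocab and tag set (stand-ins for the training script's
# `words` and `tags` lists)
words = ["<pad>", "12", "example", "street", "springfield"]
tags = ["O", "street_number", "street_name", "street_type", "locality"]
word2idx = {w: i for i, w in enumerate(words)}

# Tiny stand-in model so the save/load round trip runs; in practice this
# is the model your training run already saved to disk
model = Sequential([
    Input(shape=(None,)),
    Embedding(input_dim=len(words), output_dim=8),
    TimeDistributed(Dense(len(tags), activation="softmax")),
])
model.save("model.h5")

# --- later, in your prediction script ---
loaded = load_model("model.h5")
address = "12 example street springfield".split()
x = np.array([[word2idx[w] for w in address]])  # shape (1, num_words)
p = np.argmax(loaded.predict(x), axis=-1)       # one tag index per word
for w, pred in zip(address, p[0]):
    print("{:15}: {}".format(w, tags[pred]))
```

Since this stand-in model is untrained, the printed tags are meaningless; the point is just the `load_model` → encode → `predict` → `argmax` flow.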

narasimhankrishna commented 4 years ago

Thank you for your instant response. I will try as you indicated. Best regards.