CharlesAverill closed this issue 4 years ago
Hi @CharlesAverill, sorry for the delay in answering this question. The output shape of the model is [number of characters] x [number of classes]. So each character is assigned a class such as "street number" or "suburb", and characters of the same class are grouped when generating the final output. By the way, if you make your Keras implementation public, let me know; I'd love to take a look at your approach.
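To make the grouping step concrete, here is a minimal sketch of how a [number of characters] x [number of classes] output could be turned into labelled fields. It is not the library's actual code: the function name `group_by_class`, the two class names, and the score values are all made up for illustration, and it assumes one row of scores per input character.

```python
def group_by_class(text, scores, class_names):
    """Collect the characters of `text` under their highest-scoring class."""
    grouped = {}
    for ch, row in zip(text, scores):
        # argmax over the class axis for this character
        best = max(range(len(row)), key=lambda k: row[k])
        grouped.setdefault(class_names[best], []).append(ch)
    # Join each class's characters back into a string, trimming stray spaces
    return {name: "".join(chars).strip() for name, chars in grouped.items()}


# Hypothetical classes and per-character scores (7 characters x 2 classes)
class_names = ["street_number", "street_name"]
text = "12 Main"
scores = [
    [0.9, 0.1], [0.8, 0.2], [0.6, 0.4],              # "1", "2", " "
    [0.1, 0.9], [0.2, 0.8], [0.1, 0.9], [0.3, 0.7],  # "M", "a", "i", "n"
]
print(group_by_class(text, scores, class_names))
```

With a real model, `scores` would be the prediction for a single address, and the same loop works on a NumPy array since it only indexes row by row.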
@jasonrig Very basic Keras implementation here: https://github.com/CharlesAverill/Address-Atomization
My public code is usually much cleaner but this went straight to a client. It seems to work pretty well with generated data. If it generates enough interest I'll clean it up and package it for pip use. I really just did the Keras implementation because I find it much easier to understand than anything in TF.
@CharlesAverill I ran your code with a simple address set of my own (the address labels differ from those in the code). It produces a .h5 file and saves it to disk. How do I load this model and use it to predict the components of an address string? Could you please help with an example? That would be great. Thanks.
@narasimhankrishna I think I still have a loading script (but no promises lol), I’ll commit it tomorrow morning CST
@narasimhankrishna So it looks like I actually don't have my loading script anymore. But I believe it mainly consisted of these lines from the training script:
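import random
import numpy as np

# Pick a random test sample; randrange excludes len(X_test), while randint would include it
i = random.randrange(len(X_test))
p = model.predict(np.array([X_test[i]]))
p = np.argmax(p, axis=-1)  # most likely tag index for each token
print("{:15}: {}".format("Word", "Pred"))
for w, pred in zip(X_test[i], p[0]):
    print("{:15}: {}".format(words[w], tags[pred]))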
Obviously you'll need to load the .h5, then format the address you'd like to atomize as a numpy array in the same format as your training data. I think you'll also need a words list, which I think should contain any words you expect to see while using the model. Not 100% sure because I haven't seen this code in months. Sorry about that. Good luck!
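A rough sketch of those steps, under several assumptions that may not match the training script: the model is word-level and expects a fixed-length row of word indices, index 0 is padding, index 1 stands for unknown words, and tokens are split on whitespace. The names `encode_address`, `atomize`, `word_to_index`, `max_len` and the file name "model.h5" are placeholders, not taken from the repository.

```python
import numpy as np


def encode_address(address, word_to_index, max_len, pad_index=0, unk_index=1):
    """Turn an address string into a (1, max_len) array of word indices."""
    tokens = address.split()[:max_len]
    ids = [word_to_index.get(tok, unk_index) for tok in tokens]
    ids += [pad_index] * (max_len - len(ids))  # pad to the training length
    return tokens, np.array([ids])


def atomize(model, address, word_to_index, tags, max_len):
    """Predict one tag per token with any object that has a Keras-style predict()."""
    tokens, x = encode_address(address, word_to_index, max_len)
    probs = model.predict(x)             # assumed shape: (1, max_len, n_tags)
    pred = np.argmax(probs, axis=-1)[0]  # best tag index per position
    return [(tok, tags[p]) for tok, p in zip(tokens, pred)]


# Usage, assuming the training script saved the model as "model.h5":
# from tensorflow.keras.models import load_model
# model = load_model("model.h5")
# print(atomize(model, "12 Main Street", word_to_index, tags, max_len=20))
```

The word-to-index mapping and the tag list have to be the same ones used during training, otherwise the indices fed to and read from the model will not line up.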
Thank you for your quick response. I will try it as you indicated. Best regards.
I'm trying to re-implement this in Keras. What's the output shape for this model? Does it output the indices around text that falls in each category, or something completely different?