Open impactcolor opened 7 years ago
You should be able to generate numbers like:
python generate.py --text="1 2 3 4 5 " --noinfo --bias=4.
although the quality will probably be quite bad (too few examples in the dataset).
You can add your own examples in .xml format, but you will have to match them to those already in the dataset (the content should contain tags like <Transcription>, <Text> and <StrokeSet>, structured like in the dataset).
Alternatively, if you have labeled data with consecutive points representing how to draw numbers, you could create your own dataset.
So depending on the format of your data, it might be easier or harder. :)
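As a rough illustration of the .xml route described above, here is a minimal sketch that writes one labeled sample with <Transcription>, <Text> and <StrokeSet> tags. Only those three tag names come from this thread; the root element name, the <Stroke> and <Point> elements, and the x/y attributes are assumptions modeled on the IAM On-Line layout, and strokes_to_xml is a hypothetical helper, so compare the output against a real file from the dataset before relying on it.

```python
import xml.etree.ElementTree as ET


def strokes_to_xml(text, strokes):
    """Build an XML tree for one labeled handwriting sample.

    text    -- the label, e.g. "1"
    strokes -- list of strokes; each stroke is a list of (x, y) points
               in the order they were drawn
    """
    # Root, <Stroke> and <Point> names are assumptions, not confirmed here.
    root = ET.Element("WhiteboardCaptureSession")

    # Tags named in the thread: <Transcription>, <Text>, <StrokeSet>.
    transcription = ET.SubElement(root, "Transcription")
    ET.SubElement(transcription, "Text").text = text

    stroke_set = ET.SubElement(root, "StrokeSet")
    for stroke in strokes:
        stroke_el = ET.SubElement(stroke_set, "Stroke")
        for x, y in stroke:
            ET.SubElement(stroke_el, "Point", x=str(x), y=str(y))
    return root


# A digit "1" drawn as a single vertical stroke (made-up coordinates).
one = [[(10, 0), (10, 20), (10, 40)]]
root = strokes_to_xml("1", one)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

The resulting tree can be written to disk with ET.ElementTree(root).write(path), one file per sample.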
I'm really new to this so I'm not sure how to go about creating a dataset. Do you have any articles or direction you can point me to?
Sorry for the delay. I get the feeling you have no data, which is problematic. Could you please elaborate a little bit more on what you are trying to achieve? :)
It's no problem, thank you for taking the time to even discuss this with me. I found a dataset of handwritten numbers; however, it isn't set up like the IAM dataset currently used, which is stored in XML files. What I'm trying to accomplish is to use the handwriting generation, but it also has to include numbers, and currently the numbers do not come out well.
Ok, is this dataset publicly available? I can look into it to see if there is a way to make it compatible with my code. :)
Awesome! Here goes:
http://yann.lecun.com/exdb/mnist/
http://archive.ics.uci.edu/ml/machine-learning-databases/semeion/
I found these two
Unfortunately, those datasets represent numbers as images. For handwriting generation you would need a list of consecutive points showing how each digit is written, so those datasets cannot be used here.
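To make the difference from image data concrete, here is a minimal sketch of the kind of point sequence meant above: pen positions in drawing order, converted to (dx, dy, pen_up) offsets. This offset encoding is commonly used for handwriting generation models, but whether this repository uses exactly this layout is an assumption, and strokes_to_offsets is a hypothetical helper.

```python
def strokes_to_offsets(strokes):
    """Convert absolute pen positions into (dx, dy, pen_up) triples.

    strokes -- list of strokes; each stroke is a list of (x, y) points
               in drawing order
    pen_up is 1.0 on the last point of each stroke, 0.0 otherwise.
    """
    rows = []
    prev = None
    for stroke in strokes:
        for i, (x, y) in enumerate(stroke):
            if prev is None:
                dx, dy = 0.0, 0.0  # first point has no predecessor
            else:
                dx, dy = float(x - prev[0]), float(y - prev[1])
            pen_up = 1.0 if i == len(stroke) - 1 else 0.0
            rows.append((dx, dy, pen_up))
            prev = (x, y)
    return rows


# A digit "7" drawn in one stroke (made-up coordinates).
seven = [[(0, 0), (10, 0), (4, 14)]]
offsets = strokes_to_offsets(seven)
for row in offsets:
    print(row)
```

An image only stores which pixels are dark, so the drawing order and pen lifts needed to build such a sequence cannot be read from it directly.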
Would this one work? This has the stroke data: https://github.com/edwin-de-jong/mnist-digits-stroke-sequence-data/wiki/MNIST-digits-stroke-sequence-data
This one might work. :) Can you give some examples of sequences you want to generate? I just want to figure out what kind of augmentation the dataset might need.
Random sequences of about 5 digits, for example 11445, 8013, 1507, etc.
Sorry for the very late response. I tried this dataset and unfortunately it doesn't work well :/ The results are even worse than with the original IAM dataset. If by any chance I find a better dataset for this task, I will post it here.
THANK YOU!!!!
Well, it's been a while, but I was kind of interested in this problem and created an MNIST handwriting dataset. If you still need to generate numbers, you may find it useful. One simple solution is to just pick the needed digits from this dataset and concatenate them together. :)
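The "pick digits and concatenate" idea can be sketched as follows. The layout of the digit samples here (a dict mapping each digit character to a list of strokes of (x, y) points) is hypothetical, as are the concatenate_digits helper and its gap parameter; the actual format of the dataset mentioned above is not described in this thread.

```python
def concatenate_digits(text, digit_strokes, gap=5.0):
    """Lay out digit samples left to right to form a number.

    text          -- string of digits to render, e.g. "171"
    digit_strokes -- dict: digit character -> list of strokes, each a
                     list of (x, y) points (hypothetical layout)
    gap           -- horizontal space inserted between digits
    """
    result = []
    cursor = 0.0  # x position where the next digit starts
    for ch in text:
        strokes = digit_strokes[ch]
        xs = [x for stroke in strokes for x, _ in stroke]
        min_x, max_x = min(xs), max(xs)
        shift = cursor - min_x  # move the digit's left edge to the cursor
        for stroke in strokes:
            result.append([(x + shift, y) for x, y in stroke])
        cursor += (max_x - min_x) + gap
    return result


# Tiny made-up samples; real ones would come from the dataset.
samples = {
    "1": [[(2.0, 0.0), (2.0, 10.0)]],
    "7": [[(0.0, 0.0), (6.0, 0.0), (2.0, 10.0)]],
}
line = concatenate_digits("171", samples)
for stroke in line:
    print(stroke)
```

Picking a random sample per digit instead of a fixed one would give more varied output, at the cost of mixing writing styles within one number.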
@Grzego THANK YOU!
This is probably outside the scope of the "issues", but I figured I'd ask.
I notice it doesn't handle numbers. Is there a way to add numbers to the XML datasets so it can also generate numbers?