Closed InternetMaster1 closed 4 years ago
Updated Readme. You may find links to the pre-trained models there. But, please, bear in mind that these models were trained only for the task of human segmentation using Pixart and Supervisely Person Datasets.
Thank you very much for the super-quick answer and for providing the pre-trained models.
High-accuracy human segmentation is exactly what I am looking for!
A couple more questions:
1) What is the license of this library? Can it be used for commercial purposes?
2) I have the Supervisely dataset, but I am not familiar with the Pixart dataset. Could you provide a link to it?
3) In the final output mask, how can I also include objects that a person is holding, such as a cup, a purse, a tennis racquet, a toy, or a magazine? It could be just about anything.
I am quite perplexed by this problem.
If I am not mistaken, the Supervisely dataset doesn't contain masks for objects that a person might be holding. Does that make a dataset like Supervisely unfit for the job? Or do we need to train on a dataset with more labels than just "person"?
Ideally, an object lying off to the side can be left out of the mask, but an object the person is holding should definitely be included in the final mask.
How can this be achieved?
@voeykovroman
I tried running the pre-trained tflite file on https://github.com/tensorflow/examples/tree/master/lite/examples/image_segmentation
It's giving the following error:
Something went wrong: Cannot convert between a TensorFlowLite buffer with 602112 bytes and a Java Buffer with 3000000 bytes.
I tried the solution mentioned in this issue but to no avail.
Is this formula correct?
ByteBuffer.allocateDirect(1 * imageSize * imageSize * NUM_CLASSES * 4)
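For what it's worth, the two byte counts in the error can be checked by hand. The sketch below assumes float32 tensors (4 bytes per value) in a [1, side, side, channels] layout. Under that assumption, 602112 bytes matches a 224 x 224 tensor with 3 channels, and 3000000 bytes matches a 500 x 500 buffer with 3 channels. This is only an inference from the arithmetic; the thread does not confirm the model's actual shapes, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of the TensorFlow example app.
final class BufferSizeCheck {
    // Assumption: the model uses float32 tensors (4 bytes per value).
    static final int BYTES_PER_FLOAT = 4;

    // Bytes needed for a [1, side, side, channels] float32 tensor.
    static int tensorBytes(int side, int channels) {
        return 1 * side * side * channels * BYTES_PER_FLOAT;
    }

    // Allocates a direct buffer of exactly that size in native byte order.
    static ByteBuffer allocate(int side, int channels) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(tensorBytes(side, channels));
        buffer.order(ByteOrder.nativeOrder());
        return buffer;
    }

    public static void main(String[] args) {
        // Size reported for the TensorFlow Lite buffer in the error message.
        System.out.println(tensorBytes(224, 3)); // 602112
        // Size reported for the Java buffer in the error message.
        System.out.println(tensorBytes(500, 3)); // 3000000
    }
}
```

If that reading is right, the Java side is sized for 500 x 500 while the model expects 224 x 224, so `imageSize` (and possibly `NUM_CLASSES`) would need to match the model. Rather than hard-coding the values, the expected shape and byte count can be read from the loaded model, for example with `interpreter.getInputTensor(0).shape()` and `numBytes()`.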
Thank you Roman for the detailed answer.
@voeykovroman
Just two more questions. Thank you for your patience :)
I am lost in a sea of semantic segmentation libraries. For mobile use, where the priority is the highest accuracy and mask quality rather than speed, what would be a good option?
MobileNetV2, MobileNetV3, BiSeNet, or something else? I am also coming across libraries such as PortraitNet, SINet/ExtremeC3Net, etc. I am very confused.
Could you please point me in the right direction?
Sorry for such a late response; I have only now had time to return to the repo.
Would it be possible to provide a pre-trained model for quick evaluation?
Many thanks in advance!