Open rsehgal opened 5 years ago
2D-Visualization for POCA Raw_data dataset and POCA Filtered_dataset
3D-Visualization for POCA Raw_data dataset and POCA Filtered_dataset
Thanks Aniket, keep up the good work.
Cheers,
Also try to extract the clusters from the filtered data, and try to find the dimensions of each cluster in 3D.
An issue has arisen while performing DBSCAN. What should I do about negative values, i.e. negative coordinates?
https://scikit-learn.org/stable/modules/outlier_detection.html
Scikit Outlier Detection Methods
https://blog.floydhub.com/introduction-to-anomaly-detection-in-python/
Really useful article
Anomaly detection is a very good concept. We should certainly spend some time on it.
Yes, I think I added this article earlier.
Have a look at ijca_survey_paper.pdf. Please compare various proximity-based techniques such as k-Nearest Neighbour, k-Means, DBSCAN and IsolationForest.
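As a starting point for such a comparison, here is a minimal sketch of a k-nearest-neighbour outlier score: each point is scored by the distance to its k-th nearest neighbour, and isolated points get large scores. The function name `knn_outlier_scores`, the value of `k` and the sample points are assumptions made for illustration; they are not taken from the TomoML code or data. Only the standard library is used so the sketch runs anywhere (`math.dist` needs Python 3.8 or later).

```python
import math

def knn_outlier_scores(points, k=3):
    """Return, for each point, the distance to its k-th nearest neighbour."""
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

# Synthetic 3D points (illustrative only): a tight group plus one far-away point.
# Negative coordinates are fine because only distances between points matter.
points = [
    (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
    (0.0, 0.0, 1.0), (1.0, 1.0, 0.0), (-50.0, -50.0, -50.0),
]
scores = knn_outlier_scores(points, k=3)

# The point with the largest score is the strongest outlier candidate.
outlier_index = max(range(len(points)), key=scores.__getitem__)
print(outlier_index, round(scores[outlier_index], 2))
```

This brute-force version is quadratic in the number of points, so for a full point cloud a library implementation with a spatial index would probably be the better choice.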
With respect to outliers or clusters??
Got it. Outliers.
Another good paper that may be helpful to us. Bandieramonte_2015_J._Phys.__Conf._Ser._608_012046.pdf
Please find the Distance of Closest Approach (DoCA) histogram. Try to reproduce it using Python.
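A minimal sketch of how the histogram could be reproduced in Python. The DoCA values below are synthetic stand-ins (an assumption; the real values would come from the DoCA column of the POCA dataset), and the helper name `histogram` and the bin width are illustrative choices. The binning is done with the standard library and printed as text; with matplotlib installed, `matplotlib.pyplot.hist` would draw the same thing graphically.

```python
import random
from collections import Counter

def histogram(values, bin_width):
    """Count values per bin; each key is the lower edge of a bin."""
    counts = Counter()
    for v in values:
        bin_index = int(v // bin_width)  # floor division, so negatives bin correctly
        counts[bin_index * bin_width] += 1
    return dict(sorted(counts.items()))

# Synthetic stand-in for the DoCA column (illustrative only, not thread data).
rng = random.Random(0)
doca = [abs(rng.gauss(0.0, 1.0)) for _ in range(1000)]

hist = histogram(doca, bin_width=0.5)
for lower_edge, count in hist.items():
    # One '#' per 10 entries gives a rough text rendering of the histogram.
    print(f"{lower_edge:4.1f} | {'#' * (count // 10)}")
```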
One can also explore a Gaussian mixture model (GMM).
Yes sir, I read an article about it yesterday: https://towardsdatascience.com/wondering-how-to-build-an-anomaly-detection-model-87d28e50309
Sir, these are the visualizations of x-y, y-z, x-z and x-DoCA. Can you suggest some more visualizations to help select the right attributes?
KMeans output: clear differentiation between the clusters observed.
Very nice, Aniket. This is what I wanted!
Can we have similar results in 3D?
Did you see it?
Yes, I saw.
Hi Aniket, can we also get the mean and standard deviation of the scattering angle of the points in each individual cluster? In this case we should get 4 means and 4 standard deviations for the 4 clusters.
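One possible way to get these numbers, sketched with the standard library: group the scattering angles by cluster label, then take the mean and sample standard deviation of each group. The function name `scattering_stats`, the labels and the angle values are hypothetical; two clusters are used here only to keep the example short, and with the 4 clusters from KMeans the dictionary would have 4 entries.

```python
from collections import defaultdict
from statistics import mean, stdev

def scattering_stats(labels, angles):
    """Return {cluster_label: (mean, sample standard deviation)}."""
    grouped = defaultdict(list)
    for label, angle in zip(labels, angles):
        grouped[label].append(angle)
    # stdev needs at least two points per cluster.
    return {label: (mean(vals), stdev(vals)) for label, vals in sorted(grouped.items())}

# Hypothetical cluster labels and scattering angles (illustrative only).
labels = [0, 0, 0, 1, 1, 1]
angles = [0.001, 0.002, 0.003, 0.010, 0.020, 0.030]

stats = scattering_stats(labels, angles)
for label, (mu, sigma) in stats.items():
    print(f"cluster {label}: mean={mu:.4f} std={sigma:.4f}")
```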
DoCA vs log(DoCA) graph
Size (range) of cluster vs log(DoCA). Have a look.
Added sorted centroid (distances) vs the size (range) of cluster.
The outlier removal is not good after using the scattering angle. Have a look, sir.
Did you plot only those points which fall within 2 sigma in the histogram of the scattering angle? Can we also have the scattering angle histogram for this cluster?
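For reference, a 2-sigma selection on the scattering angle could look like the sketch below. It assumes the cut is symmetric around the mean of the cluster's scattering angles; the function name `within_n_sigma` and the sample angles are hypothetical. The returned mask can be applied to the 3D points of the same cluster.

```python
from statistics import mean, pstdev

def within_n_sigma(angles, n_sigma=2.0):
    """Return a True/False mask: True where the angle lies within n_sigma of the mean."""
    mu = mean(angles)
    sigma = pstdev(angles)  # population standard deviation of this cluster
    return [abs(a - mu) <= n_sigma * sigma for a in angles]

# Hypothetical scattering angles for one cluster; the last value is an outlier.
angles = [-0.002, -0.001, 0.0, 0.0, 0.001, 0.002, 0.05]

mask = within_n_sigma(angles)
kept = [a for a, keep in zip(angles, mask) if keep]
print(mask)
print(kept)
```

Note that a strong outlier inflates both the mean and sigma, so with heavily contaminated clusters this cut may keep more points than expected.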
One good thing that you can clearly see from your fourth plot: if you don't consider the outliers, the cluster spans from 50 to 250 along the X axis, which implies that the side length of the scatterer block along X is 250 - 50 = 200, which is exactly the value I am using in the simulations. A similar result can be seen along the Y axis.
Good, keep up the good work.
You can check the code.
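The extent estimate above can be sketched as follows. To ignore the outliers, this example uses a median/MAD cut, which is an assumption chosen for illustration and not necessarily how the plot in this thread was produced. The function name `robust_extent`, the cut value and the coordinates are hypothetical; the coordinates are built to mimic a block spanning 50 to 250 with two stray points.

```python
from statistics import median

def robust_extent(coords, cut=3.0):
    """Extent (max - min) of coords after dropping points far from the median."""
    med = median(coords)
    mad = median(abs(c - med) for c in coords)  # median absolute deviation
    scale = 1.4826 * mad  # MAD-to-sigma factor, exact only for Gaussian data
    kept = [c for c in coords if abs(c - med) <= cut * scale]
    return max(kept) - min(kept)

# Hypothetical X coordinates: a block from 50 to 250 plus two outliers.
x = [float(v) for v in range(50, 251)] + [-400.0, 700.0]

print(robust_extent(x))  # side length of the block along X
```

Applying the same function to the Y and Z coordinates of a cluster would give its dimensions in 3D.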
That's not an improved result. At the time of clustering itself it was showing such behaviour. You can see this in the cluster graph above.
OK
Sir, can you send the actual package name or the command?
I want the Pillow package.
In my opinion KMeans won't give us proper results, because it includes the outliers when calculating the centroids, which makes them deviate from the ideal positions; it is highly sensitive to outliers. After selecting a cluster, removing its outliers becomes difficult because the cluster centre itself is shifted. So we now move to k-medoids/k-medians, since medians and medoids are less sensitive to outliers.
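The argument can be illustrated with a small k-medians sketch: a Lloyd-style loop that assigns points by Manhattan distance and updates each centre with the per-coordinate median. This is a minimal illustration written for this comment, not the implementation used in the repository; the function names, the starting centres and the 2D points are all hypothetical. For comparison, the mean of the same cluster members is printed to show how far a single outlier pulls it.

```python
from statistics import mean, median

def l1(p, c):
    """Manhattan distance between two points."""
    return sum(abs(a - b) for a, b in zip(p, c))

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda k: l1(p, centers[k])) for p in points]

def k_medians(points, centers, n_iter=20):
    """Lloyd-style loop that uses per-coordinate medians as cluster centers."""
    centers = list(centers)
    for _ in range(n_iter):
        labels = assign(points, centers)
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            # An empty cluster keeps its previous center.
            new_centers.append(tuple(median(axis) for axis in zip(*members)) if members else old)
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return centers, assign(points, centers)

# Hypothetical 2D points: two tight groups and one outlier near the first group.
points = [
    (-141.0, -141.0), (-140.0, -140.0), (-139.0, -139.0),
    (139.0, 139.0), (140.0, 140.0), (141.0, 141.0),
    (-140.0, -40.0),  # outlier that gets assigned to the first cluster
]
centers, labels = k_medians(points, [(-100.0, -100.0), (100.0, 100.0)])

members0 = [p for p, lab in zip(points, labels) if lab == 0]
mean_center0 = tuple(mean(axis) for axis in zip(*members0))
print("median center:", centers[0])
print("mean center:  ", mean_center0)
```

In this toy data the median centre stays within half a unit of the tight group, while the mean of the same members is shifted by 25 units along Y.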
KMedians output based on mean scattering angle
KMedians, KMeans output in three dimensions.
Hi Aniket, please assign the colours using the mean scattering angle only up to the second decimal place.
I am generating data using different materials. Once it's ready we will run your clustering on this data as well and see if we can get a different colour for each material.
The mean scattering angle is itself of order 10^-3. Outputs for precision 2 and precision 3 are attached.
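A quick check of why the precision matters, using the four mean scattering angles reported further down in this thread. Rounding to two decimal places collapses all of them to the same value, so they cannot drive distinct colours. The last line shows one possible workaround, scaling by 10^3 before rounding; that step is a suggestion for illustration, not something done in the thread.

```python
# Mean scattering angles of the four clusters, as reported later in this thread.
mean_angles = [
    0.0008188780047930777,
    0.0012874675448983908,
    0.0004999243537895381,
    0.0016299095004796982,
]

rounded_2 = [round(a, 2) for a in mean_angles]
rounded_3 = [round(a, 3) for a in mean_angles]
print(rounded_2)  # every cluster ends up with the same value
print(rounded_3)  # two clusters still share a value

# Assumption for illustration: rescale by 10^3 first, then keep two decimals.
scaled = [round(a * 1e3, 2) for a in mean_angles]
print(scaled)
```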
Hi Aniket, please find the files attached in CSV and space-separated format: filteredDiffMaterial.txt
CSVfilteredDiffMaterial.txt CSVrawDiffMaterial.txt
GitHub was not allowing files with the .csv extension to be uploaded, so the files whose names start with CSV are actually CSV files; you can just rename them to .csv.
Yes sir, here are the final plots:
1) With a precision of 2 decimal places
2) Without precision
Can you please write the median of each cluster along with its mean scattering angle?
Final medians:

X                   Y                   Z                   Scat_Angle              DoCA                 Mean_Scattering_Angle
-142.0755           102.2153            12.135200000000001  -0.040366360000000004   1.9149356499999999   0.0008188780047930777
142.0355            158.93189999999998  -94.5895            -0.021385835            0.44684695           0.0012874675448983908
-140.204            -140.766            -5.22398            0.000646646             0.130713             0.0004999243537895381
140.84750000000003  -159.72415          -90.7157            -0.0004726000000000001  0.24103750000000002  0.0016299095004796982
You can compare the K-Means and K-Medians cluster results by looking at the files in the dataset folder.
Any significant difference between KMeans and KMedians?
The results are not bad. Only the first row is not as expected, but this may be due to too few events, so maybe we should repeat this test with at least double the number of events.
-142.0755, 102.2153, 0.0008188780047930777 (Pb)
142.0355, 158.93189999999998, 0.0012874675448983908 (Fe)
-140.204, -140.766, 0.0004999243537895381 (Al)
140.847500, -159.72415, 0.0016299095004796982 (Pb)
In the other three rows, Al has the least scattering, then Fe, and the maximum is for Pb.
=========
Good job.
For one cluster, the (x, y) centre is (-145, -146) with KMedians and (-145, -142) with KMeans, whereas for another cluster it is (-145, 134) with KMedians and (-145, 142) with KMeans.
Let's discuss on Monday.
Here the target is to process the point cloud generated by the simulation. The first obvious step would be to visualize the points in 2D and 3D, then write an outlier detection algorithm to remove the outliers, and finally try to find clusters.