Closed by murrayds 4 years ago
I tried node2vec with the data split by discipline, but I think the effect of discipline is very small.
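A minimal sketch of what splitting by discipline could look like before the embedding step, assuming each trajectory is a list of organization names tagged with one discipline, and that a split node is simply the organization name joined with the discipline. The function `split_by_discipline`, the separator, and the toy trajectories are hypothetical, not the actual pipeline.

```python
def split_by_discipline(trajectory, discipline, sep="|"):
    """Relabel every organization in a trajectory as an (org, discipline) node.

    Assumption: one discipline per researcher trajectory. The resulting
    token lists can be fed to any walk/sequence based embedding model.
    """
    return [f"{org}{sep}{discipline}" for org in trajectory]


# Toy data, not taken from the real mobility data set.
trajectories = [
    (["IU", "MIT"], "Physics"),
    (["IU", "KAIST"], "Biology"),
]

sentences = [split_by_discipline(orgs, field) for orgs, field in trajectories]
print(sentences)
```

With this relabeling, "IU|Physics" and "IU|Biology" become separate nodes, so the embedding can place the same organization differently for each discipline.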
This figure is colored by field, and this figure is colored by country.
On a national scale, however, there is some pattern. This is the case in the USA.
I can think of two possible reasons why the embedding is not clustered at the international level:
1. The disciplines are too broad to capture the difference. In this case, we could divide the personas (disciplines) further based on co-occurrence analysis.
2. There might be an imbalance between domestic and international trajectories in the mobility data (in terms of amount).
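The second possibility can be checked directly by counting how many moves cross a national border. The sketch below is only illustrative: it assumes each trajectory is an ordered list of (organization, country) pairs keyed by researcher id, which is a made-up layout and not necessarily how the mobility files are stored. The function name `international_share` and the toy data are also hypothetical.

```python
from collections import Counter


def international_share(trajectories):
    """Return the fraction of moves that cross a country border.

    `trajectories` maps a researcher id to an ordered list of
    (organization, country) pairs. This layout is an assumption.
    """
    counts = Counter()
    for stops in trajectories.values():
        # Compare each stop with the next one in the trajectory.
        for (org_a, country_a), (org_b, country_b) in zip(stops, stops[1:]):
            if org_a == org_b:
                continue  # same organization, so not a move
            kind = "international" if country_a != country_b else "domestic"
            counts[kind] += 1
    total = sum(counts.values())
    return counts["international"] / total if total else 0.0


# Toy data, not taken from the real mobility data set.
toy = {
    "r1": [("IU", "USA"), ("MIT", "USA"), ("KAIST", "KOR")],
    "r2": [("IU", "USA"), ("UCLA", "USA")],
    "r3": [("KAIST", "KOR"), ("KAIST", "KOR")],
}
share = international_share(toy)
print(f"{share:.2%} of moves are international")
```

A very small share on the real data would support the idea that domestic moves dominate what the embedding learns.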
Amazing! Thanks for working this up!
It's probably a mix of both. I can try to get more fine-grained disciplinary classifications, but they become less interpretable when we get more granular.
However, I think that point 2 is the major factor. International mobility is rare compared to the total number of scholars (I will calculate this as a percentage of our data). I think that we can still make use of persona2vec though, possibly by:
Let's think on it and maybe bring it up during the team meeting?
You mean tomorrow?
Sure, maybe we can meet after the SG meeting and you can run me through what these results mean again? And we can discuss next steps.
I tried to use city_to_region for the USA data set, but it returns NaN values for some cities. Do you want to fill these in manually? @murrayds
I will fill out some missing city names and update you when I am done. I am also working on getting regional-scale data now for every organization, even outside of the United States.
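To get a list of the cities that still need filling in, something like the following could help. It is a sketch that assumes `city_to_region` behaves like a plain dict from city name to region, with missing entries either absent or stored as NaN; the real object may be shaped differently. The helper names and the toy cities and regions are hypothetical.

```python
import math


def is_missing(value):
    """True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def unmapped_cities(cities, city_to_region):
    """Return the sorted, de-duplicated cities that have no usable region."""
    return sorted({c for c in cities if is_missing(city_to_region.get(c))})


# Toy mapping: one NaN entry and one city that is absent altogether.
city_to_region = {
    "Bloomington": "Midwest",
    "Boston": "Northeast",
    "Ann Arbor": float("nan"),
}
cities = ["Bloomington", "Boston", "Ann Arbor", "Palo Alto", "Boston"]

missing = unmapped_cities(cities, city_to_region)
print(missing)
```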
I am trying to apply the gravity rule to the discipline-split embedding. In 2008-2019_nonmobile, there is no information about the researchers' fields. Can you give me the fields of the non-mobile researchers? @murrayds
Here are the results from the split embedding.
The fields of the non-mobile researchers can be found in the file SME-dropbox/Data/Raw/nonmobile_researcher_trajectories.txt
Meanwhile, the trajectories for mobile researchers are now stored in SME-dropbox/Data/Raw/mobile_researcher_trajectories.txt
Nice! It's good to know that the performance persists in the discipline-split data. I'm making a presentation for Wednesday and I'll throw this in.
Thanks a lot. I was using the M_i terms as a fraction of the original institute's size; I will change to this value. Thank you!
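For reference, a gravity-style prediction over the embedding could be sketched as follows. This assumes the common form flux = scale * M_i * M_j / d^alpha, with the cosine distance between embedding vectors as d. The thread does not state the exact functional form, distance, or mass definition used here, so `gravity_flux`, `alpha`, and `scale` are illustrative names and defaults, and the masses and vectors in the example are toy numbers.

```python
import math


def cosine_distance(u, v):
    """1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def gravity_flux(mass_i, mass_j, vec_i, vec_j, alpha=1.0, scale=1.0):
    """Expected flow between two nodes under a power-law gravity form.

    Assumed form: scale * M_i * M_j / d**alpha, with d the cosine distance.
    """
    d = cosine_distance(vec_i, vec_j)
    if d <= 0.0:
        raise ValueError("distance must be positive for the power-law form")
    return scale * mass_i * mass_j / d ** alpha


# Toy example: orthogonal vectors have cosine distance 1.
far = gravity_flux(100, 50, (1.0, 0.0), (0.0, 1.0))
near = gravity_flux(100, 50, (1.0, 0.0), (1.0, 1.0))
print(far, near)
```

With equal masses, the pair that is closer in the embedding gets the larger predicted flow, which is the behavior the gravity rule is meant to capture.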
A la persona2vec.