Closed ColmMassey closed 4 years ago
We need to define how to map two fields in the Youth data to our schema.
For this iteration, let's just do Organisational Structure. See https://vocabs.solidarityeconomy.coop/essglobal/V2a/html-content/essglobal.html#V2a
We can make the following straightforward mapping from Type to Organisational Structure:
Type | | Organisational Structure | |
---|---|---|---|
Cooperativa de consumo / usuario final | -> | Consumer co-operative | OS80 |
Coopérative de consommateur.rice.s | | | |
Final consumer/user cooperative | | | |
Cooperativa de múltiples actores | -> | Multi-stakeholder co-operative | OS100 |
Coopérative pluri-acteurs | | | |
Multi-stakeholder cooperative | | | |
Cooperativa de producción | -> | Producer co-operative | OS90 |
Coopérative de producteur.rice.s (dont agricole) | | | |
Producer cooperative | | | |
Cooperativa de trabajo y empleo | -> | Self Employed | OS150 |
Cooperativa di lavoro | | | |
Work and employment cooperative | | | |
Coopérative de travailleur.se.s | -> | Workers co-operative | OS60 |
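A minimal sketch of how the table above could be expressed as a lookup. It assumes that each label without an arrow maps to the same code as the nearest arrowed row above it, which the table suggests but does not state. The names `TYPE_TO_ORG_STRUCTURE` and `org_structure_for` are hypothetical, not taken from the existing scripts.

```python
# Labels and codes are copied from the mapping table.
# Assumption: rows without an arrow share the code of the arrowed row above.
_MAPPING = {
    "Cooperativa de consumo / usuario final": "OS80",
    "Coopérative de consommateur.rice.s": "OS80",
    "Final consumer/user cooperative": "OS80",
    "Cooperativa de múltiples actores": "OS100",
    "Coopérative pluri-acteurs": "OS100",
    "Multi-stakeholder cooperative": "OS100",
    "Cooperativa de producción": "OS90",
    "Coopérative de producteur.rice.s (dont agricole)": "OS90",
    "Producer cooperative": "OS90",
    "Cooperativa de trabajo y empleo": "OS150",
    "Cooperativa di lavoro": "OS150",
    "Work and employment cooperative": "OS150",
    "Coopérative de travailleur.se.s": "OS60",
}

# Normalise keys once so lookups ignore case and surrounding whitespace.
TYPE_TO_ORG_STRUCTURE = {key.casefold(): code for key, code in _MAPPING.items()}


def org_structure_for(type_label):
    """Return the Organisational Structure code for a Type label, or None."""
    if type_label is None:
        return None
    return TYPE_TO_ORG_STRUCTURE.get(type_label.strip().casefold())
```

Returning `None` for an unmapped label leaves it to the caller to decide whether to skip the row or fall back to a default.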
Hi @ColmMassey Initial version deployed at: https://data1.solidarityeconomy.coop/ica-youth-network/
They're all only co-ops for now (organisational structure). We need to finish the mapping and implement it.
Btw @ColmMassey, how did you extract the youth data for the newest file in the Nextcloud? The CSV file is really messy (e.g. random quotation marks in places, missing fields, names in the lat/lng fields).
It has some generic problems with the rows as well.
For example, the fields in this row don't line up with the header; there are two random commas inserted:
Organization Type,Name,name,Region,Country,City,Latitude,Longitude,Size,Type,Sector,Address,Description,Additional Details,Website,Email,,
Youth-led Co-ops,Bukavu Youth Agripreneurs (BYA),Agriculture,Africa,Democratic Republic of Congo,Bukavu,-2.5123017,28.8480284,0-5,Producer cooperative,Agriculture,"Av. Du Plateau No45 A, 3�me niveau", Q. Nguba," C. Ibanda/Bukavu-South Kivu, Democratic Republic of Congo (DRC)","AgriTech is an organization whose central objective is to respond, in quantity and quality, to the needs of rural and urban people in food production, for sustainable improvement to their health and financial situation. AgriTech provides training to youth in IT and agriculture because we believe that youth are the next generation to make this world a safer and better place to live.",,www.agritech.online,info@agritech.online
or this line
Youth-led Co-ops,Campus Credit Multi-Purpose Cooperative Society,Banking / credit unions,Africa,Nigeria,Abraka,5.7894321,6.1023468,6-20,Final consumer/user cooperative,Banking / credit unions,"Post Graduate Class (Campus 1) Institute of Education, Delta State University, P .M. B 1, Abraka Delta State, Nigeria","CampusCredit Cooperative Society is a student-owned/driven consumer cooperative which originated from an idea by a group of post-graduate students in the Institute of Education, Delta State University, Abraka. The goal is to engineer trade systems on campuses that will enhance students financial well-being which in turn improves students academic performance.",Our strategy is simple: Harness students purchasing power through benefits associated with economies of scale", drive financial inclusion through promotion of the cooperative business enterprise," and enhance financial literacy amongst students through research-driven fInancial lIteracy counseling programs.,www.campuscredit.coop,team@campuscredit.coop
with a bunch of random quotation marks in the middle
Should we fix this here or create a new issue? The old file (the one uploaded 21 days ago) does not seem to have these issues; I have currently uploaded that one as LOD.
We can either clean the data or just throw away bad entries,
or we can allow the LOD data to have small errors (e.g. some string in the lat/lng fields) and make sure we account for that.
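As a rough illustration of the "throw away or tolerate bad entries" options, the sketch below separates rows whose field count or coordinates look wrong from rows that parse cleanly. It is not the project's actual loader: `split_rows` and `parse_coord` are hypothetical names, and the only column names it relies on are `Latitude` and `Longitude` from the header shown above.

```python
import csv
import io


def parse_coord(value, limit):
    """Return value as a float if it is a plausible coordinate, else None."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    # NaN and infinities fail this range check and are rejected too.
    return number if -limit <= number <= limit else None


def split_rows(csv_text):
    """Split parsed CSV rows into (good_records, bad_rows)."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    good, bad = [], []
    for row in reader:
        # Stray commas or misplaced quotes change the number of fields.
        if len(row) != len(header):
            bad.append(row)
            continue
        record = dict(zip(header, row))
        lat = parse_coord(record.get("Latitude"), 90)
        lng = parse_coord(record.get("Longitude"), 180)
        if lat is None or lng is None:
            bad.append(row)
            continue
        record["Latitude"], record["Longitude"] = lat, lng
        good.append(record)
    return good, bad
```

Keeping the rejected rows in a separate list, rather than discarding them silently, makes it possible to report them back to the data provider.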
Updated with the new data and did the mapping for organisational structure at: https://dev.ica-youth-network.solidarityeconomy.coop/
I am doing 'some' cleaning on the data. Basically, I remove the first and last " symbols that surround each row, then remove each stray " that sits between the "" pairs (which are the actual delimiter), then some other ' symbols.
I also have some code for fixing the errors above and placing the " in the right places, but that messes up the good entries. The data as it is is alright, and errors raised from bad fields are caught and handled.
Would you like me to clean up the data fully or just leave it as it is?
This is the url to what seems to be the raw data? https://docs.google.com/spreadsheets/d/e/2PACX-1vRBT9x3W7Cw-7EEZczfTExYNrrO6yFfe7drhXiHTsRkSg7q2TR3r902ybpcOikqZ5-YCqz2T04wo4qU/pub?gid=573145819&single=true&output=csv
> Would you like me to clean up the data fully or just leave it as it is?
We shouldn't be doing any cleaning that can't be automatic when loading new versions, but it is better to get cleaner raw data. Is it cleaner when you pull from this Google doc? There are two data sets, but let's just work on this one for now, the youth-led co-ops.
> it has some generic problems with the rows as well
I don't know what the issue was with the first data I downloaded, but the stuff I get now using that url looks clean. Do you concur @dtmakm27 ?
It is much better than the last, thanks for the fast response (saved me the effort of having to clean the previous one :) )
It also has minor problems but these will be accounted for when automatically generating data (i.e. additional commas which mess up the fields and random text in email/website fields). It should be alright if we receive data in this state (the link you provided)
New data is published: https://data1.solidarityeconomy.coop/ica-youth-network/index.html Map is also updated with the new data: https://dev.ica-youth-network.solidarityeconomy.coop/
The only thing left I think is that the address is one big string and is not separated in different segments. Should we do that or leave it as it is? @ColmMassey
> The only thing left I think is that the address is one big string and is not separated in different segments. Should we do that or leave it as it is? @ColmMassey
Let's leave the address as is for now. How is it decided what fields are listed in the dialog? For example in Oxford, the description is listed, but not in the ICA Youth data?
Ah, that is the SPARQL query. The email is not showing either. Be right back, I'll do it now.
Note, I've created an issue for generating a sameAs list linking dotcoop & Youth ICA. https://github.com/SolidarityEconomyAssociation/open-data/issues/15
https://dev.ica-youth-network.solidarityeconomy.coop/
Added email and description. Note: in the original CSV file we have a field called description and a field called additional description. I am just appending additional description to description and putting them into one field.
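A small sketch of the merge described in this comment. It assumes the two fields are the `Description` and `Additional Details` columns from the CSV header quoted earlier in the thread; `merge_description` is a hypothetical name.

```python
def merge_description(row):
    """Join the description and additional details into one string."""
    parts = [row.get("Description", ""), row.get("Additional Details", "")]
    # Skip empty or whitespace-only parts so no stray separator is added.
    return " ".join(part.strip() for part in parts if part and part.strip())
```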
> added email and description

Let's leave out email for now as several of them are individuals. We would need permission to publish.

> just appending additional description to description and putting them into one field
Makes sense.
Do you want to add more todo here or is it ready for review?
Once the email is dropped, put in For Review.
Ah wait, so I should remove the email?
dropped the email https://dev.ica-youth-network.solidarityeconomy.coop/
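One way to make sure the email never reaches the published data is to strip it when building the public record, rather than relying on the query to omit it. This is only a sketch; `PRIVATE_FIELDS` and `public_view` are hypothetical names, and `Email` is the column name from the CSV header.

```python
# Fields that must not be published without permission.
PRIVATE_FIELDS = {"Email"}


def public_view(row):
    """Return a copy of the row without any private fields."""
    return {key: value for key, value in row.items() if key not in PRIVATE_FIELDS}
```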
@ColmMassey @dtmakm27
I notice the ICA youth data has two "name" fields, presumably a mistake because the second looks like some other sort of data.
The first line of the data, for example, makes me think it is a duplicate of "Type". As such I think we can ignore it, but it might be worth mentioning to ICA so they can correct it (and add any other field they may have intended.) Also, see my next comment.
Another point: identifiers are missing from this data.
We get by in the demo by inserting our own. Ours are just an incrementing integer, added to initiatives in the order they are seen in this file. However, that won't work in general, and trying to track our own index of IDs would be a headache which is probably entirely avoidable.
Can we ask them to provide the member ID or some other unique identifier?
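To illustrate why order-based identifiers are fragile, here is a minimal sketch of the incrementing scheme. `assign_sequential_ids` is a hypothetical name, and starting the count at 0 is an assumption.

```python
def assign_sequential_ids(names):
    """Assign each initiative an ID based on its position in the file."""
    return {name: index for index, name in enumerate(names)}


# The same initiative gets a different ID once a row is inserted before it.
before = assign_sequential_ids(["A", "B", "C"])
after = assign_sequential_ids(["A", "X", "B", "C"])
ids_shift = before["B"] != after["B"]
```

Any row added, removed, or reordered in a later snapshot changes the IDs of everything after it, which is why a stable identifier supplied with the data is preferable.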
> Can we ask them to provide the member ID or some other unique identifier?
I have made enquiries.
There is someone currently cleaning up all the Youth Co-op data to align it with the regular ICA data format, so let's leave this issue closed until they come back with the new format in July.
Create required scripts to process and publish a snapshot of the ICA Youth Network's member data on our dev server.
See: here for background.
Use project name ica-youth-network