kevinzg / facebook-scraper

Scrape Facebook public pages without an API key
MIT License
2.29k stars, 616 forks

Is it possible to extract a list of members in a public group using get_group_info()? #305

Closed: jonatrios closed this issue 3 years ago

jonatrios commented 3 years ago

As in get_profile(), which loops through friends and fills a list with dicts that contain 'link', 'name' and 'tagline'. Maybe also the member's user ID within the group and their display name.

e.g. <a class="oajrlxb2 g5ia77u1 qu0x051f esr5mh6w e9989ue4 r7d6kgcz rq0escxv nhd2j8a9 nc684nl6 p7hjln8o kvgmc6g5 cxmmr5t8 oygrvhab hcukyx3x jb3vyjys rz4wbd8a qt6c0cv9 a8nywdso i1ao9s8h esuyzwwr f1sip0of lzcic4wl oo9gr5id gpro0wi8 lrazzd5p" href="/groups/708950959251121/user/1203711361/" role="link" tabindex="0">Bruno Cardinale</a>
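
The href in the anchor above already carries both IDs mentioned in the request. As a minimal sketch (the helper name `parse_group_user_href` is hypothetical and not part of facebook-scraper, and the URL shape is assumed from this single example), the group ID and user ID could be pulled out like this:

```python
import re

# Hypothetical helper, not part of facebook-scraper. The pattern is an
# assumption based on the one href quoted above: /groups/<digits>/user/<digits>/
GROUP_USER_HREF = re.compile(r"^/groups/(?P<group_id>\d+)/user/(?P<user_id>\d+)/?$")


def parse_group_user_href(href):
    """Return {'group_id': ..., 'user_id': ...}, or None if the href has another shape."""
    match = GROUP_USER_HREF.match(href)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_group_user_href("/groups/708950959251121/user/1203711361/"))
```

Hrefs in any other form (for example a plain profile path) return None rather than raising, so the caller can decide how to handle them.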

neon-ninja commented 3 years ago

Currently, get_group_info extracts the admins of a group; however, viewing admins/members requires cookies:

pprint.pprint(get_group_info("708950959251121", cookies="cookies.txt"))

outputs

{'admins': [{'link': '/carlos.diuk?refid=18', 'name': 'Carlos Greg Diuk'},
            {'link': '/emislej?refid=18', 'name': 'Ernesto Mislej'},
            {'link': '/manuelaristaran?refid=18', 'name': 'Manuel Aristarán'},
            {'link': '/ideas.rapidas?refid=18', 'name': 'Pablo Zivic'}],
 'id': '708950959251121',
 'members': 9985,
 'name': 'Data Science Argentina',
 'type': 'Public group'}
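
The admin links in this output carry a `?refid=18` tracking parameter. If you want clean profile paths, a small post-processing step along these lines could work. This is only a sketch: `strip_refid` is a hypothetical helper, not something the library provides, and it assumes `refid` is the only parameter you want dropped.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_refid(link):
    """Remove the 'refid' query parameter while keeping any others (e.g. 'id')."""
    parts = urlsplit(link)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key != "refid"]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), parts.fragment))


if __name__ == "__main__":
    # Two of the admin entries from the output above.
    admins = [
        {"link": "/carlos.diuk?refid=18", "name": "Carlos Greg Diuk"},
        {"link": "/emislej?refid=18", "name": "Ernesto Mislej"},
    ]
    cleaned = [{**admin, "link": strip_refid(admin["link"])} for admin in admins]
    print(cleaned)
```

Filtering by parameter name rather than cutting at the `?` matters for `/profile.php?id=...` style links, where the query string is the only thing identifying the profile.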

Are you saying you want get_group_info to extract non-admin members too?

jonatrios commented 3 years ago

Yes! Non-admin members, that is what I meant to say, sorry!

neon-ninja commented 3 years ago

This commit (https://github.com/kevinzg/facebook-scraper/commit/7cf5030b73550e695e79d715892b7d33be7d8382) should do it; however, it seems m.facebook.com only lets you see up to 116 public group members. Now that I think about it, that's probably why I started with just admins when I was initially developing get_group_info. The same limit applies on mbasic.facebook.com.

pprint.pprint(get_group_info("708950959251121", cookies="cookies.txt"))

Outputs

{'admins': [{'link': '/carlos.diuk', 'name': 'Carlos Greg Diuk'},
            {'link': '/emislej', 'name': 'Ernesto Mislej'},
            {'link': '/manuelaristaran', 'name': 'Manuel Aristarán'},
            {'link': '/ideas.rapidas', 'name': 'Pablo Zivic'}],
 'id': '708950959251121',
 'members': [{'link': '/falkonery.rivera', 'name': "'Rivera S' Falkonery"},
             {'link': '/diego.amez1', 'name': 'A Diego Amez Mendes'},
             {'link': '/eerrol.casttill', 'name': 'A Eerrol Casttill'},
             {'link': '/profile.php?id=100053879061832',
              'name': 'AIS - Aplicaciones de Inteligencia Artificial'},
             {'link': '/alberto.melendez.5477', 'name': 'ALberto Meléndez'},
             {'link': '/danaa.chavez19', 'name': 'ALdana Chavez'},
             {'link': '/ale.cruz.98096', 'name': 'ALe Cruz'},
             {'link': '/ce.anita', 'name': 'ANita Morales'},
             {'link': '/araliibeth.reymundogarcia',
              'name': 'ARali Reymundo Garcia'},
             {'link': '/aaron.benchiheubperez',
              'name': 'Aaron Benchiheub Pérez'},
             {'link': '/aaron.vargas1', 'name': 'Aaron Indiecito'},
             {'link': '/aaron.yehg', 'name': 'Aaron Yeh'},
             {'link': '/aaronsetillo', 'name': 'Aaronseti Algo'},
             {'link': '/AaronSk80rdie', 'name': 'Aarón Roverano'},
             {'link': '/aashika.mahajan.7', 'name': 'Aashika Mahajan'},
             {'link': '/sad.saim', 'name': 'Abdul Samad Afridi'},
             {'link': '/abxda', 'name': 'Abel Coronado Iruegas'},
             {'link': '/abel.labrana', 'name': 'Abel Labraña'},
             {'link': '/abel.limachi.7', 'name': 'Abel Limachi'},
             {'link': '/abel.sanchezbechur', 'name': 'Abel Sánchez Bechur'},
             {'link': '/abelardo1206', 'name': 'Abelardo Lugo'},
             {'link': '/profile.php?id=100004762893950',
              'name': 'Abhishek R Gupta'},
             {'link': '/profile.php?id=100008192941110',
              'name': 'Abigail Geronimo'},
             {'link': '/abigail.grimberg', 'name': 'Abigail Grimberg'},
             {'link': '/profile.php?id=100006536883728',
              'name': 'Abraham Borda'},
             {'link': '/profile.php?id=100008820702314', 'name': 'Abraham Dev'},
             {'link': '/abraham.vila', 'name': 'Abraham Vila'},
             {'link': '/Robintux', 'name': 'Abraham Zamudio'},
             {'link': '/abril.ibarra.98837', 'name': 'Abril Ibarra'},
             {'link': '/AcelgaTenues', 'name': 'Acelga Tenues'},
             {'link': '/clasesactuariales/', 'name': 'Actuariales'},
             {'link': '/adha.martinezissa', 'name': 'Ada Martinez Issa'},
             {'link': '/ada.zamora.56', 'name': 'Ada Zamora'},
             {'link': '/AdamGardner123', 'name': 'Adam Gardner'},
             {'link': '/adhemar.gianini', 'name': 'Adhemar Gianini'},
             {'link': '/adiel.tapari', 'name': 'Adiel Tapari'},
             {'link': '/alsnwi', 'name': 'Adnan Alsnwi'},
             {'link': '/AnonimoXCat', 'name': 'Adolf Patriarca'},
             {'link': '/aobexx', 'name': 'Adolfo Antonio'},
             {'link': '/adonai.sala', 'name': 'Adonai Sala'},
             {'link': '/adrian.arellanosangama',
              'name': 'Adrian Arellano Sangama'},
             {'link': '/adrian.fazio.3', 'name': 'Adrian Armando Fazio'},
             {'link': '/Arturito.arispe.torrez',
              'name': 'Adrian Arturo Arispe Torrez'},
             {'link': '/adrian.cafa', 'name': 'Adrian Cafa'},
             {'link': '/adriancapchaq', 'name': 'Adrian Capcha Quispe'},
             {'link': '/adrian.castiglione.54', 'name': 'Adrian Castiglione'},
             {'link': '/adrian.dalfonso.56', 'name': "Adrian D'Alfonso"},
             {'link': '/Adrz93', 'name': 'Adrian David Rojas'},
             {'link': '/adrian.d.ruggiero', 'name': 'Adrian Di Ruggiero'},
             {'link': '/adrianfz13', 'name': 'Adrian Fernandez'},
             {'link': '/adrian.iramain.3', 'name': 'Adrian Iramain'},
             {'link': '/victoradrianjimenez', 'name': 'Adrian Jimenez'},
             {'link': '/adri.4n.marin0', 'name': 'Adrian Marino'},
             {'link': '/adrian.micelotta', 'name': 'Adrian Micelotta'},
             {'link': '/adrianlnc', 'name': 'Adrian Nieto Castillo'},
             {'link': '/adripcj', 'name': 'Adrian Parisi'},
             {'link': '/ap988', 'name': 'Adrian Pellegrino'},
             {'link': '/adrian.quiroga.18', 'name': 'Adrian Quiroga'},
             {'link': '/adrian.rodas.5', 'name': 'Adrian Rodas'},
             {'link': '/adrian.romero.3538', 'name': 'Adrian Romero'},
             {'link': '/adrianalbertosantillan', 'name': 'Adrian Santillan'},
             {'link': '/adrian.spadavecchia.7', 'name': 'Adrian Spadavecchia'},
             {'link': '/adrian.tolosa.18', 'name': 'Adrian Tolosa'},
             {'link': '/profile.php?id=1577151118',
              'name': 'Adriana A. Cuenca'},
             {'link': '/adriana.gabriela.520', 'name': 'Adriana Gabriela'},
             {'link': '/hurtadri', 'name': 'Adriana Hurtado'},
             {'link': '/adriana.iannicelli.5', 'name': 'Adriana Iannicelli'},
             {'link': '/2acruz', 'name': 'Adriana Impeesa'},
             {'link': '/adriana.madeleine.9', 'name': 'Adriana Madeleine'},
             {'link': '/persefoneAM', 'name': 'Adriana Maulini'},
             {'link': '/adriana.orozcopereira',
              'name': 'Adriana Orozco Pereira'},
             {'link': '/adrianaalicia.perez.5', 'name': 'Adriana Pérez'},
             {'link': '/adry.rg.5', 'name': 'Adriana RG'},
             {'link': '/sol992', 'name': 'Adriana Soledad'},
             {'link': '/adriana.verduga', 'name': 'Adriana Verduga'},
             {'link': '/adriano.thebridge', 'name': 'Adriano TheBridge'},
             {'link': '/adriano.villella', 'name': 'Adriano Ville'},
             {'link': '/acastioni', 'name': 'Adrián Castioni'},
             {'link': '/adriancastroarq', 'name': 'Adrián Castro'},
             {'link': '/adrian.chamudis', 'name': 'Adrián Chamudis'},
             {'link': '/adrian.dipaolo', 'name': 'Adrián Di Paolo'},
             {'link': '/agfreisinger', 'name': 'Adrián Freisinger'},
             {'link': '/AdrianGonzalesM', 'name': 'Adrián Gonzales'},
             {'link': '/adrian00.kim', 'name': 'Adrián Kim'},
             {'link': '/adrianjleon', 'name': 'Adrián León Guaidó'},
             {'link': '/hori.pardo.83', 'name': 'Adrián Pardo'},
             {'link': '/AdrianPino22', 'name': 'Adrián Pino'},
             {'link': '/profile.php?id=1277642118', 'name': 'Adrián Rodriguez'},
             {'link': '/Adrian.M.Urrutia', 'name': 'Adrián Urrutia'},
             {'link': '/adrian.velazquez.1420', 'name': 'Adrián Velázquez'},
             {'link': '/avinocur85', 'name': 'Adrián Vinocur'},
             {'link': '/afra.blundetto.961', 'name': 'Afra Blundetto'},
             {'link': '/Agop-Karagoz-694210354362279/', 'name': 'Agop Karagoz'},
             {'link': '/agos.cejas.92', 'name': 'Agos Cejas'},
             {'link': '/agos.marquez.90', 'name': 'Agos Marquez'},
             {'link': '/agoos.segoviia', 'name': 'Agos Segovia'},
             {'link': '/tiny.coniglio', 'name': 'Agostina Coniglio'},
             {'link': '/agostina.larrazabal', 'name': 'Agostina Larrazabal'},
             {'link': '/profile.php?id=100016741746613',
              'name': 'Agredisto Armando Lio'},
             {'link': '/profile.php?id=100009385505882',
              'name': 'Agudelo Sebas'},
             {'link': '/AgusAnna97', 'name': 'Agus Annacondia'},
             {'link': '/agustina.aragon.752', 'name': 'Agus Aragón'},
             {'link': '/aecipriano', 'name': 'Agus Cipriano'},
             {'link': '/agus.colantonio', 'name': 'Agus Colantonio'},
             {'link': '/agus.correa.79', 'name': 'Agus Correa'},
             {'link': '/adamiani', 'name': 'Agus Damiani'},
             {'link': '/agus.gatica.10', 'name': 'Agus Gatica'},
             {'link': '/CeliLandolt', 'name': 'Agus Gomez'},
             {'link': '/agus.magnoni', 'name': 'Agus Magnoni'},
             {'link': '/agusmanauta', 'name': 'Agus Manauta'},
             {'link': '/agus.matkovac', 'name': 'Agus Matkovac'},
             {'link': '/amesiacrawley', 'name': 'Agus Mesía Crawley'},
             {'link': '/agus.montiel', 'name': 'Agus Montiel'},
             {'link': '/aguusnm', 'name': 'Agus Nuñez'},
             {'link': '/agus.pantaleon.5', 'name': 'Agus Pantaleon'},
             {'link': '/agus.pugliese.1', 'name': 'Agus Pugliese'}],
 'name': 'Data Science Argentina',
 'type': 'Public group'}
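
The member links in this output come in two shapes: a path slug (`/abxda`, `/clasesactuariales/`) or `/profile.php?id=<number>`. If you need one identifier per member, something like the sketch below could normalise them. `member_identifier` is a hypothetical helper, not part of facebook-scraper, and it only assumes the two link shapes visible above.

```python
from urllib.parse import parse_qs, urlsplit


def member_identifier(link):
    """Return ('id', numeric_id) for profile.php links, ('slug', path_slug) otherwise.

    Returns None when neither can be determined.
    """
    parts = urlsplit(link)
    if parts.path == "/profile.php":
        ids = parse_qs(parts.query).get("id")
        return ("id", ids[0]) if ids else None
    slug = parts.path.strip("/")
    return ("slug", slug) if slug else None


if __name__ == "__main__":
    # A few member entries copied from the output above.
    members = [
        {"link": "/falkonery.rivera", "name": "'Rivera S' Falkonery"},
        {"link": "/profile.php?id=100053879061832",
         "name": "AIS - Aplicaciones de Inteligencia Artificial"},
        {"link": "/clasesactuariales/", "name": "Actuariales"},
    ]
    for member in members:
        print(member["name"], member_identifier(member["link"]))
```

Keep in mind that, per the comment above, the list appears to be capped at roughly 116 entries, so the length of `members` should not be read as the group's member count.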

jonatrios commented 3 years ago

OK, that works for me. Thanks a lot! Consider this issue closed.