hammerheaddf / privacy-scraper

Downloads content from privacy.com.br
12 stars, 5 forks

Finds the posts but not the media. #2

Closed ramas-jpg closed 7 months ago

ramas-jpg commented 9 months ago

C:\pvc>python main.py github
Abrindo página de login...
Aguardando autenticação...
Procurando página de postagens do perfil github...
Buscando postagens com mídia...
Post 6726d365-2b67-42d8-8a2d-45737aa424c1: 100%|██████████████████████████████████████████| 340/340 [00:23, 14.73it/s]
325 postagens com texto e mídia, 0 mídias encontradas.
Baixando 0 mídias.
Sem mídia para baixar.
Encerrado.

It opens Chromium in incognito mode, opens the privacy login page, fills in the username and password, and opens the chosen profile, but it only finds the posts and doesn't download any media. In theory, this particular profile has 340 posts and 3,771 media items.

I saw that you don't have time to work on a fix right now, so I'll wait in case you're able to get to one some day!

ramas-jpg commented 8 months ago

Just stopping by to give feedback on the main.py update:

C:\pvc>python main.py #######
Abrindo página de login...
Aguardando autenticação...
Procurando página de postagens do perfil #######...
Buscando postagens com mídia...
Post ########-####-####-####-############: 100%|██████████████████████████████████████████| 345/345 [00:19, 17.35it/s]
330 postagens com texto e mídia, 0 mídias encontradas.
Baixando 0 mídias.
Sem mídia para baixar.
Encerrado.

ishimarumakoto commented 7 months ago

Hi. I'm interested in helping with the project, but I can't even get past the first step:

PS C:\Users\Downloads\privacy-scraper-playwright> python .\main.py
Traceback (most recent call last):
  File "C:\Users\Downloads\privacy-scraper-playwright\main.py", line 12, in <module>
    import metadata as meta
  File "C:\Users\Downloads\privacy-scraper-playwright\metadata.py", line 8, in <module>
    from sqlalchemy.orm import mapped_column
ImportError: cannot import name 'mapped_column' from 'sqlalchemy.orm' (C:\Users\AppData\Local\Programs\Python\Python311\Lib\site-packages\sqlalchemy\orm\__init__.py)

hammerheaddf commented 7 months ago

Hi. Sorry for not putting the instructions on GitHub. First you need to install the dependencies.

python -m pip install -r requirements.txt

Then you need to pass the profile name as a parameter.

python main.py

I'm not a Python expert, but if you have more questions, send them over.


ishimarumakoto commented 7 months ago

Hey, thanks for the reply, my friend. I had already run

python -m pip install -r requirements.txt

But when I run

python main.py my personal profile

I get nothing but this error message:

Traceback (most recent call last):
  File "C:\Users\Downloads\privacy-scraper-playwright\main.py", line 12, in <module>
    import metadata as meta
  File "C:\Users\Downloads\privacy-scraper-playwright\metadata.py", line 8, in <module>
    from sqlalchemy.orm import mapped_column
ImportError: cannot import name 'mapped_column' from 'sqlalchemy.orm' (C:\Users\AppData\Local\Programs\Python\Python311\Lib\site-packages\sqlalchemy\orm\__init__.py)

EDIT: solved. My pip install was installing the wrong version. I edited the line to sqlalchemy==2.0.28 and the error is gone.
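
For anyone hitting the same ImportError, here is a minimal sketch of a version check. It assumes that `mapped_column` is only available from SQLAlchemy 2.0 onwards, which matches the fix reported above (pinning 2.0.28). The helper names `supports_mapped_column` and `installed_sqlalchemy_version` are made up for this example and are not part of the project.

```python
from importlib import metadata


def supports_mapped_column(version: str) -> bool:
    """Return True if the version string is 2.x or newer.

    Assumption: mapped_column ships with SQLAlchemy 2.0 and later.
    """
    major = int(version.split(".")[0])
    return major >= 2


def installed_sqlalchemy_version():
    """Return the installed SQLAlchemy version string, or None if absent."""
    try:
        return metadata.version("sqlalchemy")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_sqlalchemy_version()
    if found is None:
        print("SQLAlchemy is not installed")
    elif supports_mapped_column(found):
        print(f"SQLAlchemy {found} should provide mapped_column")
    else:
        print(f"SQLAlchemy {found} is too old; try the 2.0.28 pin from this thread")
```

If the check reports an older version, reinstalling with the pinned version mentioned in the comment above is probably the simplest way out.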

ishimarumakoto commented 7 months ago

Sorry to be a pain, but I'd like to at least get it running so I can see whether I can help, hahaha

This is the problem I'm hitting now:

Traceback (most recent call last):
  File "C:\Users\Downloads\privacy-scraper-playwright\main.py", line 438, in <module>
    asyncio.run(main())
                ^^^^^^
  File "C:\Users\AppData\Local\Programs\Python\Python311\Lib\site-packages\asyncclick\core.py", line 1205, in __call__
    return anyio.run(self._main, main, args, kwargs, **opts)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AppData\Local\Programs\Python\Python311\Lib\site-packages\anyio\_core\_eventloop.py", line 73, in run
    return async_backend.run(func, args, {}, backend_options)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AppData\Local\Programs\Python\Python311\Lib\site-packages\anyio\_backends\_asyncio.py", line 2001, in run
    return runner.run(wrapper())
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AppData\Local\Programs\Python\Python311\Lib\asyncio\runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AppData\Local\Programs\Python\Python311\Lib\asyncio\base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "C:\Users\AppData\Local\Programs\Python\Python311\Lib\site-packages\anyio\_backends\_asyncio.py", line 1989, in wrapper
    return await func(*args)
           ^^^^^^^^^^^^^^^^^
  File "C:\Users\AppData\Local\Programs\Python\Python311\Lib\site-packages\asyncclick\core.py", line 1208, in _main
    return await main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AppData\Local\Programs\Python\Python311\Lib\site-packages\asyncclick\core.py", line 1120, in main
    rv = await self.invoke(ctx)
         ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AppData\Local\Programs\Python\Python311\Lib\site-packages\asyncclick\core.py", line 1485, in invoke
    return await ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AppData\Local\Programs\Python\Python311\Lib\site-packages\asyncclick\core.py", line 824, in invoke
    rv = await rv
         ^^^^^^^^
  File "C:\Users\Downloads\privacy-scraper-playwright\main.py", line 369, in main
    await user.type(settings.user)
                    ^^^^^^^^^^^^^
  File "C:\Users\AppData\Local\Programs\Python\Python311\Lib\site-packages\dynaconf\base.py", line 145, in __getattr__
    value = getattr(self._wrapped, name)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AppData\Local\Programs\Python\Python311\Lib\site-packages\dynaconf\base.py", line 328, in __getattribute__
    return super().__getattribute__(name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'Settings' object has no attribute 'USER'

hammerheaddf commented 7 months ago

Relax, I'm the one who didn't write the instructions, hahaha. You need to put your credentials in the .secrets.yml file, in this format:

user:
pwd:
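
A rough sketch of a pre-flight check that would catch the missing `user` setting before the browser even opens. This is not how the project reads its settings (the traceback shows it goes through dynaconf); the function `missing_credentials` is hypothetical, and the line-based parsing below only handles flat `key: value` pairs, not full YAML.

```python
REQUIRED_KEYS = ("user", "pwd")  # the two keys named in the comment above


def missing_credentials(secrets_text: str) -> list:
    """Return the required keys that are absent or empty in the given text.

    Only flat "key: value" lines are understood; this is a simplification,
    not a YAML parser.
    """
    found = {}
    for raw in secrets_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        found[key.strip().lower()] = value.strip()
    return [key for key in REQUIRED_KEYS if not found.get(key)]


if __name__ == "__main__":
    try:
        with open(".secrets.yml", encoding="utf-8") as handle:
            missing = missing_credentials(handle.read())
    except FileNotFoundError:
        missing = list(REQUIRED_KEYS)
    if missing:
        print("Missing in .secrets.yml:", ", ".join(missing))
    else:
        print("Credentials look complete")
```

Running it in the project folder before `python main.py` should give a clearer message than the AttributeError in the traceback above.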


ishimarumakoto commented 7 months ago

Done, I finally have access! I'll open a few feature requests to keep track of the things I'm thinking about, then I'll see about creating a branch for myself and submitting the PRs afterwards.

ishimarumakoto commented 7 months ago

@hammerheaddf if you have an idea of how to get downloads working again and want to share it, I can try to implement it later.

ishimarumakoto commented 7 months ago

I started a draft to try to fix it here: https://github.com/hammerheaddf/privacy-scraper/commit/23b74d51379fe82d6b08a6b8399466160553388f

ishimarumakoto commented 7 months ago

Finishing it up here: https://github.com/hammerheaddf/privacy-scraper/commit/8f1da0e2a3efe2e1b83ec8325d2199de9cbd86b6

I think it's almost there. It is downloading, but it isn't 100% yet: it should have downloaded 101 and only downloaded 95. I'll check properly later whether the logic is right, but it is downloading.
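
One way to track down a gap like 101 expected versus 95 downloaded is to compare identifiers rather than counts. The sketch below is only an illustration: `find_missing` and `find_duplicates` are hypothetical helpers, and it assumes each media item has some unique identifier (for example a URL or file name), which the thread does not confirm.

```python
from collections import Counter


def find_missing(expected_ids, downloaded_ids):
    """Return the expected identifiers that were never downloaded, in order."""
    downloaded = set(downloaded_ids)
    return [item for item in expected_ids if item not in downloaded]


def find_duplicates(ids):
    """Return identifiers that occur more than once, sorted.

    Duplicates in the expected list can inflate the expected count.
    """
    return sorted(item for item, count in Counter(ids).items() if count > 1)


if __name__ == "__main__":
    # Made-up identifiers purely for demonstration.
    expected = [f"media-{n}" for n in range(101)]
    downloaded = expected[:95]
    print(len(find_missing(expected, downloaded)), "items missing")
    print(len(find_duplicates(expected)), "duplicated identifiers")
```

If the missing list is empty but the counts still differ, duplicated entries in the expected list are a likely suspect.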

ishimarumakoto commented 7 months ago

Fixed with PR: https://github.com/hammerheaddf/privacy-scraper/pull/6

@hammerheaddf I'll leave the MD with the usage tutorial in your hands, because I'm feeling lazy, hahaha

ramas-jpg commented 7 months ago

Finishing it up here: 8f1da0e

I think it's almost there. It is downloading, but it isn't 100% yet: it should have downloaded 101 and only downloaded 95. I'll check properly later whether the logic is right, but it is downloading.

Man, how can I download that specific code change so I can replace my copy?

hammerheaddf commented 7 months ago

I'll publish the PRs, then you can grab them.
