MIR-MU / ARQMath-data-preprocessing

Preprocessed ARQMath competition datasets

When I run dvc repro, I get "ERROR: unexpected error - module 'numpy' has no attribute 'int'". How do I fix it? #2

Open aspnetcs opened 7 months ago

aspnetcs commented 7 months ago

In https://github.com/MIR-MU/ARQMath-data-preprocessing, when I run the command dvc repro, I get: ERROR: unexpected error - module 'numpy' has no attribute 'int'.

How do I fix it?

PetrSojka commented 7 months ago

Just use the Python built-in int instead of numpy.int. See https://stackoverflow.com/questions/74946845/attributeerror-module-numpy-has-no-attribute-int
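For code you control, the replacement is mechanical: numpy.int was only an alias for the built-in int, deprecated in NumPy 1.20 and removed in NumPy 1.24. A minimal sketch of the rewrite follows. The helper name `replace_np_int` and the sample line are illustrative assumptions, and a regex rewrite like this is naive (it would also touch matches inside strings or comments).

```python
import re

# Matches `np.int` or `numpy.int` as a whole attribute, but not
# `np.int64`, `np.int_`, or `np.integer`, which are still valid names.
NP_INT = re.compile(r"\b(?:np|numpy)\.int\b")


def replace_np_int(source: str) -> str:
    """Swap the removed alias for the built-in int (hypothetical helper)."""
    return NP_INT.sub("int", source)


# Hypothetical line of user code, before and after the rewrite.
before = "ids = np.array(raw, dtype=np.int)  # np.int64 stays untouched"
after = replace_np_int(before)
print(after)
```

If the exact bit width matters, an explicit type such as np.int64 may be a better replacement than the built-in int.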


Witiko commented 7 months ago

While @PetrSojka's advice would be helpful if you saw a similar issue when running your own code, it doesn't help you much when the error comes from a third-party library such as dvc.

In requirements.txt, we have an ancient version of the dvc library (0.92.0 from April 2, 2020), which apparently does not play well with the current version of the numpy library. You should either update the dvc library to the current version (3.49.0 from March 26, 2024) using the command pip install -U dvc, or downgrade the numpy library to version 1.18.2 (from March 17, 2020) using the command pip install numpy==1.18.2.
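Before changing anything, it may help to see which versions are actually installed in the active environment. The sketch below uses only the standard library; the `PINNED` mapping simply repeats the versions named in this thread, and the function names are illustrative. The cutoff in `numpy_has_int_alias` reflects that NumPy removed the numpy.int alias in release 1.24.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# Versions named in this thread as the ones used when the repo was created.
PINNED = {"dvc": "0.92.0", "numpy": "1.18.2"}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def numpy_has_int_alias(numpy_version: str) -> bool:
    """True if this NumPy release still ships the numpy.int alias (< 1.24)."""
    major, minor = (int(part) for part in numpy_version.split(".")[:2])
    return (major, minor) < (1, 24)


for name, pinned in PINNED.items():
    print(f"{name}: installed={installed_version(name)}, pinned={pinned}")
```

If the installed numpy is 1.24 or newer and dvc is still 0.92.0, the AttributeError from the issue title is the expected outcome.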

Please, let us know if either of these options helped you fix your issue.

aspnetcs commented 6 months ago

When I run the upgrade command (pip install dvc --upgrade) and then run dvc pull, the following error occurs:

(ARQMath-data-preprocessing) @.***:/home/ARQMath-data-preprocessing# dvc pull

WARNING: failed to collect 'workspace', skipping

'./output_data/ntcir/NTCIR12-Math-Wiki-Formula/NTCIR12-MathWikiFormula-queries-prefix-participants.json.dvc' validation failed: 2 errors.

extra keys not allowed, in outs -> 0 -> metric, line 3, column 3

2 outs:

3 - path: NTCIR12-MathWikiFormula-queries-prefix-participants.json

4 cache: true

extra keys not allowed, in cmd

How do I fix it?


aspnetcs commented 6 months ago

Content of NTCIR12-MathWikiFormula-queries-prefix-participants.json.dvc:

md5: 41b3980b41121b480552d8f915d2f833

outs:

cmd: make NTCIR12-MathWikiFormula-queries-prefix-participants.json


aspnetcs commented 6 months ago

How do I use Python int to replace numpy.int?

In which file or files should I replace numpy.int with a Python int?


Witiko commented 6 months ago

'./output_data/ntcir/NTCIR12-Math-Wiki-Formula/NTCIR12-MathWikiFormula-queries-prefix-participants.json.dvc' validation failed: 2 errors.

It would appear that the format of the *.dvc files has changed since dvc 0.92.0 (April 2, 2020), so upgrading DVC is not an option for this repository as it stands. Instead, you should try downgrading both dvc and numpy to the versions we used when we created this repo:

pip install dvc==0.92.0 numpy==1.18.2

Please, let us know if this helped to fix your issue.

how to use Python int to replace numpy.int?

In which file or files, replace the numpy.int with a Python int?

You would need to track down the module in the dvc library (or one of its dependencies) that still uses the removed numpy.int alias and patch it in your Python installation. That sounds too adventurous to me.
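For anyone who does want to attempt it, locating the offending lines can be automated. The following is a rough sketch, not a recommended fix: `find_np_int` is a hypothetical helper that walks a directory of installed packages and reports every line that mentions np.int or numpy.int. It is a plain text search, so it can report false positives in comments or strings, and it will miss code that imports numpy under another name.

```python
import re
from pathlib import Path
from typing import Iterator, Tuple

# Whole-attribute match: skips np.int64, np.int_, and np.integer.
NP_INT = re.compile(r"\b(?:np|numpy)\.int\b")


def find_np_int(root: Path) -> Iterator[Tuple[Path, int, str]]:
    """Yield (file, line number, line) for each use of the removed alias."""
    for path in sorted(root.rglob("*.py")):
        try:
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            continue  # unreadable file: skip rather than abort the scan
        for number, line in enumerate(text.splitlines(), start=1):
            if NP_INT.search(line):
                yield path, number, line.strip()
```

A call such as `for path, number, line in find_np_int(Path(site_packages)): print(f"{path}:{number}: {line}")` would list the candidates, where `site_packages` stands for the site-packages directory shown in your own traceback.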

aspnetcs commented 6 months ago

After I ran pip install dvc==0.92.0 numpy==1.18.2, dvc repro produced the following errors:

(olddvcnumpy) @.***:/home/ARQMath-data-preprocessing# dvc repro

WARNING: assuming default target 'Dvcfile'.

/root/.local/lib/python3.8/site-packages/scipy/__init__.py:143: UserWarning: A NumPy version >=1.19.5 and <1.27.0 is required for this version of SciPy (detected version 1.18.2)

warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"

ERROR: '/home/ARQMath-data-preprocessing/Dvcfile' does not exist.

How do I create the Dvcfile?


Witiko commented 6 months ago

/root/.local/lib/python3.8/site-packages/scipy/__init__.py:143: UserWarning: A NumPy version >=1.19.5 and <1.27.0 is required for this version of SciPy (detected version 1.18.2)

Then you may need to upgrade to numpy==1.19.5 using the command pip install numpy==1.19.5 (or, if that leads to the numpy.int error again, look into downgrading scipy). Regardless, this seems to be a warning rather than an error, so perhaps it does not need to be solved immediately.
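The range in the warning can be checked mechanically before installing anything. This is a small sketch with illustrative function names; the default bounds are copied from the SciPy warning quoted above, and the parser assumes purely numeric version components such as 1.19.5. Since NumPy kept the numpy.int alias until release 1.24, a version like 1.19.5 should satisfy SciPy without bringing back the original error.

```python
from typing import Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Turn '1.19.5' into (1, 19, 5); assumes purely numeric components."""
    return tuple(int(part) for part in version.split("."))


def scipy_accepts(numpy_version: str,
                  minimum: str = "1.19.5",
                  maximum: str = "1.27.0") -> bool:
    """Check minimum <= numpy_version < maximum, the range from the warning."""
    return parse(minimum) <= parse(numpy_version) < parse(maximum)


for candidate in ("1.18.2", "1.19.5", "1.23.5", "1.27.0"):
    print(candidate, scipy_accepts(candidate))
```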

WARNING: assuming default target 'Dvcfile'. ERROR: '/home/ARQMath-data-preprocessing/Dvcfile' does not exist.

We have many separate *.dvc files in the repository such as output_data/ARQMath_CLEF2020/Task2/Formula_topics_cmml_and_pmml_V2.0.tsv.dvc. Perhaps you will need to run dvc repro for each of them individually:

dvc repro output_data/ARQMath_CLEF2020/Task2/Formula_topics_cmml_and_pmml_V2.0.tsv.dvc

Furthermore, dvc 0.92.0 seems to have an -R option for processing all *.dvc files in a directory recursively:

dvc repro -R .
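If the -R option does not behave as expected, the per-file approach can be scripted. The sketch below only builds the command lines and does not run them; `repro_commands` is a hypothetical helper, and skipping anything inside a `.dvc` directory is an assumption about where DVC keeps its internal files.

```python
from pathlib import Path
from typing import List


def repro_commands(root: Path) -> List[List[str]]:
    """Build one `dvc repro <file>` command per *.dvc file under root."""
    stage_files = sorted(
        p for p in root.rglob("*.dvc")
        # Keep regular files only, and skip DVC's own `.dvc` directory.
        if p.is_file() and ".dvc" not in p.relative_to(root).parts[:-1]
    )
    return [["dvc", "repro", str(p)] for p in stage_files]
```

Each returned list could then be printed for review or passed to `subprocess.run(command, check=True)` from the repository root.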

However, please note that the scripts require datasets that you may not have installed, citing from README.md (emphasis mine):

Producing the preprocessed datasets

To produce the preprocessed datasets yourself,

Therefore, you may want to fetch the artefacts using dvc pull and treat our code as documentation rather than as something that can be easily executed without significant modifications.