htrc/htrc-feature-reader
Tools for working with HTRC Feature Extraction files
39 stars, 12 forks
issues
FutureWarning in feature_reader.py
#49
cskemp
opened
1 year ago
0
HTTP Error with Volume()
#48
melaniewalsh
opened
1 year ago
6
Allow retaining JSON-LD column names
#47
bmschmidt
opened
2 years ago
2
Arrow transfer
#46
bmschmidt
opened
2 years ago
0
Access denied via Volume()
#45
Ori-Pixel
opened
2 years ago
6
Specify readme encoding for windows support
#44
bmschmidt
opened
3 years ago
0
pyarrow roadmap
#43
bmschmidt
opened
3 years ago
7
Reducing memory consumption in the core wordcount loop
#42
bmschmidt
opened
3 years ago
4
Change indexes returned by pd.read_parquet(tokenlist) (not public API).
#41
bmschmidt
opened
3 years ago
1
Store parquet volume metadata on the parquet tokencounts file
#40
bmschmidt
opened
3 years ago
3
Provide arrow_counts method on volume to bypass pandas
#39
bmschmidt
opened
3 years ago
4
Installation on Windows
#38
younbaek
opened
3 years ago
4
Update EF size information
#37
organisciak
closed
4 years ago
0
Closes #35
#36
organisciak
closed
4 years ago
0
Option to suppress printing of paths/ids at FeatureReader instantiation
#35
wilkens
closed
4 years ago
2
Feature Reader 2.0
#34
organisciak
closed
4 years ago
0
File handler refactor
#33
bmschmidt
closed
4 years ago
0
Feedback on new parsing strategy
#32
bmschmidt
closed
4 years ago
6
Remove ujson dependency from setup.py?
#31
bmschmidt
closed
4 years ago
3
Parquetreader
#30
organisciak
closed
5 years ago
1
Fix for change in EF Access API
#29
borice
closed
5 years ago
1
Include convenience function for displaying readable htids in IPython notebooks.
#28
bmschmidt
closed
6 years ago
4
Parse MARC-XML from HT
#27
organisciak
closed
6 years ago
0
Remove Solr metadata lookup
#26
organisciak
closed
6 years ago
0
Add doc string to utils methods
#25
organisciak
closed
6 years ago
0
Sortlevel deprecated
#24
organisciak
closed
6 years ago
0
Add argument for EF endpoint
#23
JaimieMurdock
closed
4 years ago
1
Feature request: volume availability check
#22
rburke2233
closed
6 years ago
1
Add data corrections argument
#21
organisciak
closed
4 years ago
0
Page image module
#20
organisciak
closed
4 years ago
0
Online id-based read
#19
organisciak
closed
7 years ago
0
Version 1.90 incorrectly labelled 1.9
#18
organisciak
closed
7 years ago
0
Docs refer incorrectly to PD
#17
organisciak
closed
7 years ago
0
`htrc_features.utils.download_file(silent=True)` does not suppress rsync output.
#16
JaimieMurdock
closed
7 years ago
0
Integrate volume id and record id parsing from HTRC-PythonSDK
#15
organisciak
closed
4 years ago
0
Remove multiprocessing code
#14
organisciak
closed
7 years ago
0
Online initialization
#13
organisciak
closed
7 years ago
2
Better __repr__ for FeatureReader and Volume
#12
organisciak
closed
7 years ago
0
Reimplement term-document matrix
#11
organisciak
closed
4 years ago
0
Books with high word/page count not reading tokens
#10
organisciak
closed
7 years ago
0
feature request: wrap rsync downloads inside module.
#9
bmschmidt
closed
7 years ago
11
feature request: __getitem__ (or similar) for page ranges
#8
senderle
closed
4 years ago
5
Update Feature reader to support schema 3.0
#7
organisciak
closed
8 years ago
0
Bad key in volume_term_freqs with lowercase characters
#6
organisciak
closed
4 years ago
1
Docs are not found
#5
keithfma
closed
8 years ago
2
syntax error in htrc_features/feature_downloader.py
#4
keithfma
closed
8 years ago
4
README multiprocessing code missing
#3
organisciak
closed
8 years ago
0
Does htrc-feature-reader fully support HTRC schema 2.0?
#2
wilkens
closed
8 years ago
8
fixed margins and syntax highlighting
#1
edsu
closed
10 years ago
0